Numerous car accidents are caused by improper driving maneuvers. Serious injuries are, however, avoidable if such maneuvers are detected beforehand and the driver is assisted accordingly. Much recent research has therefore focused on the automated prediction of driving maneuvers based on handcrafted features extracted mainly from in-cabin driver videos. Since the outside view of the traffic scene may also contain informative features for driving maneuver prediction, we present a framework for detecting the driver's intention based on both in-cabin and traffic scene videos. More specifically, we (1) propose a Convolutional-LSTM (ConvLSTM)-based autoencoder to extract motion features from the out-cabin traffic scene, (2) train a classifier that jointly considers motion from inside and outside the cabin for maneuver intention anticipation, and (3) show experimentally that the inside and outside image features carry complementary information. Our evaluation on the publicly available Brain4cars dataset shows that our framework achieves a prediction accuracy of 83.98% and an F1-score of 84.3%.
I. INTRODUCTION
According to the World Health Organization [2], about 1.35 million people die in car accidents every year worldwide. This figure, however, does not include non-fatal injuries from traffic accidents. Most of these accidents are caused by improper driver behavior: based on statistics from the Department for Transport (DfT) in Great Britain, a survey [6] revealed that 15,560 reported accidents were due to a poor turn or maneuver, which ranked among the top five causes of road accidents in 2017. As automated vehicle technology emerges, it promises to be safer than human driving [3], [4], [5]. However, much research is still needed to reach full automation that works in all traffic situations and weather conditions. On the way to autonomous vehicles, it is therefore necessary to equip existing Advanced Driver Assistance Systems (ADAS) with the ability to collaborate with the human driver as efficiently as possible, for example by alerting the driver in case of a dangerous maneuver.
Recently, many researchers have focused on detecting the driver's maneuver intention before execution. For example, Brain4cars [1] and the Honda Research Institute Driving Dataset (HDD) [7] are two datasets specifically designed for learning driver behavior. HDD [7] uses three high-resolution video cameras, GPS, and signals from a LiDAR sensor and the vehicle CAN bus to record traffic scenes. Brain4cars [1] provides videos from inside and outside the car, recorded together with GPS and vehicle dynamics.