High-performance deep learning-based object detection models applied to dashcam images can help reduce traffic accidents during nighttime driving. Deep learning requires a large-scale dataset to obtain a high-performance model. However, existing object detection datasets consist mostly of daytime scenes, with only a few nighttime scenes, and enlarging a nighttime dataset is laborious and time-consuming. In such a case, daytime images can be converted to nighttime images with an image-to-image translation model, which augments the nighttime dataset with less effort because the translated images can reuse the annotations of the daytime dataset. Therefore, this study proposes a GAN-based image-to-image translation model for nighttime data augmentation that incorporates self-attention with cycle consistency and content/style separation and shows high fidelity to the annotations of the daytime dataset. Experimental results highlight the effectiveness of the proposed model compared with other models in terms of translated images and FID scores. Moreover, the high fidelity of the translated images to the annotations is verified with a small object detection model, based on detection results and mAP. Ablation studies confirm the effectiveness of self-attention in the proposed model. As a contribution to GAN-based data augmentation, the source code of the proposed image translation model is publicly available at https://github.com/subecky/Image-Translation-With-Self-Attention
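The cycle-consistency idea mentioned in the abstract can be illustrated with a minimal sketch: translating a daytime image to night and back should recover the original image, and likewise in the other direction. The code below is an assumption-laden toy, not the authors' implementation: the names `l1_distance`, `cycle_consistency_loss`, `darken`, `brighten`, and `brighten_lossy` are hypothetical, images are flattened lists of pixel values, and the "generators" are simple arithmetic functions standing in for neural networks. It does not model the self-attention or content/style separation components.

```python
from typing import Callable, List

# Hypothetical representation: a flattened image with pixel values in [0, 1].
Image = List[float]
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    day: Image,
    night: Image,
    day_to_night: Generator,
    night_to_day: Generator,
) -> float:
    """Sum of the day -> night -> day and night -> day -> night reconstruction errors."""
    forward = l1_distance(night_to_day(day_to_night(day)), day)
    backward = l1_distance(day_to_night(night_to_day(night)), night)
    return forward + backward


# Toy stand-ins for the two generators (assumptions for illustration only).
def darken(img: Image) -> Image:
    return [0.5 * p for p in img]


def brighten(img: Image) -> Image:
    return [2.0 * p for p in img]


def brighten_lossy(img: Image) -> Image:
    # An imperfect inverse of `darken`: adds a constant offset before clipping.
    return [min(1.0, 2.0 * p + 0.1) for p in img]


if __name__ == "__main__":
    day_image = [0.2, 0.4, 0.8]
    night_image = [0.1, 0.2, 0.4]
    print(cycle_consistency_loss(day_image, night_image, darken, brighten))
    print(cycle_consistency_loss(day_image, night_image, darken, brighten_lossy))
```

When the two generators are exact inverses, the loss is zero; the lossy pair yields a positive loss. In this toy setting, that penalty is what pushes a translation model to preserve scene content, which is the property that would let daytime annotations carry over to the translated nighttime images.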
Object detection is one of the most important tasks in computer vision-based automation, such as advanced driver assistance systems in driving automation. To maximize road safety, traffic-related objects should be detected while they are still far away, when they appear small in the recorded scene. Because drivers tend to miss more traffic-related objects when driving at night, this work focuses on nighttime in-vehicle camera images. Because the videos are recorded with an in-vehicle camera, the objects to be detected in this study, such as traffic signs and pedestrians, occupy only a small part of the frame when they are far from the ego vehicle. Furthermore, time-series information must be taken into account to detect objects across sequential frames. Therefore, this research proposes an object detection model that combines the RefineDet small object detection model and the TSSD video detection model. Experimental results confirm the effectiveness of the proposed model. Moreover, a publicly available benchmark dataset is used to confirm the performance of the proposed model on both daytime and nighttime images.