In cities around the world, people rely on acoustic alerts every day to move safely in and around traffic. With the advent of autonomous vehicles (AVs), it is unclear how these vehicles can use the same acoustic cues to supplement their decision making. This question will be especially important during the prolonged period in which autonomous and conventional vehicles share the road. One solution may lie in advances in machine learning, which have made it possible to "teach" a machine (or a vehicle) to recognize certain sounds. This paper reports on an ongoing project whose objective is to identify emergency vehicle sirens in traffic and alert the vehicle so that it can take rapid evasive action. In particular, we report on the use of a deep Convolutional Neural Network (CNN), AlexNet, which we retrained to recognize emergency sirens in real time. To use this network, samples from the ESC-50 dataset for environmental sound classification were processed and each converted to a spectrogram. This CNN can be used in conjunction with a microphone array to accurately recognize sirens in traffic and identify the direction from which the emergency vehicle is approaching.
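The spectrogram conversion mentioned above can be illustrated with a minimal sketch. The abstract does not state the window type, frame length, hop size, or scaling used in the project, so the Hann window, `frame_len`, `hop`, and decibel scaling below are assumptions chosen only for illustration, and the function name `spectrogram` is hypothetical.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=1024, hop=512):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len, hop, and the Hann window are illustrative assumptions,
    not values reported in the abstract.
    Returns (freqs_hz, times_s, spec_db) with spec_db shaped
    (frame_len // 2 + 1, n_frames).
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Zero-pad very short clips so at least one full frame exists.
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))

    n_frames = 1 + (signal.size - frame_len) // hop
    window = np.hanning(frame_len)

    # Slice the clip into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])

    # Magnitude of the one-sided FFT of each frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small offset avoids log(0).
    spec_db = 20.0 * np.log10(magnitude + 1e-10)

    freqs_hz = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times_s = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return freqs_hz, times_s, spec_db.T
```

Before such an array could be passed to an image classifier like AlexNet, it would still need to be rendered or resized to the input dimensions the network expects; that step is omitted here because the abstract does not describe it.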