Transportation technology continues to advance, increasing both the number of vehicles and the number of road users. This growth benefits the economy but also has negative consequences, such as accidents and crime on the highway. In 2018, Indonesia recorded 109,215 traffic accidents with 29,472 deaths, mostly caused by delayed treatment of the casualties. In the same year, there were 8,423 mugging cases and 90,757 snatching cases in Indonesia, of which only 23.99% were reported. This low reporting rate is mostly caused by a lack of awareness and knowledge about where to report. A quick-response surveillance system is therefore needed. In this study, we built an audio-based accident and crime detection system using a neural network. To improve the system's robustness, we augmented our dataset by mixing it with noises that are likely to occur on the road. The system was tested across several settings of segment duration, bandpass filter cut-off frequencies, feature extraction method, network architecture, and threshold value to obtain optimal accuracy and performance. The best accuracy was obtained by a convolutional neural network architecture using a 200 ms segment duration, a 0.5 overlap ratio, bandpass cut-off frequencies of 100 Hz and 12,000 Hz, and a threshold value of 0.9. With these parameters, our system achieves 93.337% accuracy. In future work, we hope to deploy this system in a real environment.
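
As a rough illustration of the reported best configuration, the sketch below shows how a recording might be cut into 200 ms segments with a 0.5 overlap ratio, and how a 0.9 threshold might be applied to per-segment network outputs. The function names, the 44.1 kHz sample rate, and the reading of the threshold as a minimum confidence for flagging a segment are assumptions made for illustration, not details taken from the study. Bandpass filtering, feature extraction, and the network itself are omitted.

```python
from typing import List, Sequence


def segment_signal(
    samples: Sequence[float],
    sample_rate: int,
    segment_ms: float = 200.0,
    overlap: float = 0.5,
) -> List[List[float]]:
    """Split a mono signal into fixed-length, overlapping segments.

    Hypothetical helper: trailing samples that do not fill a whole
    segment are dropped (an assumption, not the study's stated policy).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    seg_len = int(round(sample_rate * segment_ms / 1000.0))
    hop = int(round(seg_len * (1.0 - overlap)))
    if seg_len <= 0 or hop <= 0:
        raise ValueError("segment length and hop must be positive")
    last_start = len(samples) - seg_len
    return [
        list(samples[start:start + seg_len])
        for start in range(0, last_start + 1, hop)
    ]


def flag_events(confidences: Sequence[float], threshold: float = 0.9) -> List[bool]:
    """Flag a segment only when the model confidence reaches the threshold.

    Assumes one confidence score per segment, e.g. the highest class
    probability produced by the classifier for that segment.
    """
    return [c >= threshold for c in confidences]


if __name__ == "__main__":
    sample_rate = 44100  # assumed; the abstract does not state the sample rate
    one_second = [0.0] * sample_rate
    segments = segment_signal(one_second, sample_rate)
    print(len(segments), "segments of", len(segments[0]), "samples each")
    print(flag_events([0.95, 0.50, 0.90]))
```

Note that a 12,000 Hz upper cut-off requires a sample rate above 24,000 Hz (the Nyquist limit), which is why a 44.1 kHz rate is assumed here. A higher threshold trades missed detections for fewer false alarms, which may explain why a value as strict as 0.9 performed best.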