Developing intelligent systems that detect abnormal events in the real world and in real time is challenging because of difficult environmental conditions, hardware limitations, and computational and algorithmic constraints. As a result, detection performance often degrades in dynamically changing environments. However, in next-generation factories, an anomaly detection system based on acoustic signals is especially needed to quickly detect and intervene in abnormal events during industrial processes, given the increased cost of complex equipment and facilities. In this study, we propose a real-time Acoustic Anomaly Detection (AAD) system that uses sequence-to-sequence Autoencoder (AE) models in industrial environments. The proposed processing pipeline uses audio features extracted from the streaming audio signal captured by a single-channel microphone. The reconstruction error generated by the AE model is calculated to measure the degree of abnormality of the sound event. The performance of the Convolutional Long Short-Term Memory AE (Conv-LSTMAE) is evaluated and compared with that of the sequential Convolutional AE (CAE) using sounds captured from various industrial manufacturing processes. In experiments conducted with the real-time AAD system, the Conv-LSTMAE-based AAD shows better detection performance than the CAE-based AAD under different signal-to-noise ratio conditions for sound events such as explosion, fire, and glass breaking.

KEYWORDS
acoustic anomaly detection, audio feature extraction, convolutional autoencoder, convolutional long short-term memory autoencoder, industrial processes

1 | INTRODUCTION

The use of smart systems in homes, factories, cities, and so forth has become more popular to ease human life, especially in surveillance and monitoring tasks.
Therefore, a wide variety of sensory information of different types and nature, stemming from vision, audition, force/torque, temperature, energy consumption, power, network, and so forth, is individually or jointly utilized in monitoring tasks. However, processing these signals in real time is a challenging problem for abnormal event detection in dynamically changing environments. The aim of anomaly detection is to distinguish abnormal events from usual ones. For the new generation of industrial manufacturing systems, monitoring production with a focus on anomalies is one of the significant capabilities, since abnormal events can affect the quality of manufactured products and degrade the continuity and reliability of processes and assets (Panfilenko, Poller, Sonntag, Zillner, & Schneider, 2016). Even worse, some anomalies in production processes can endanger the safety of people who use industrial machines in the