In recent years, with the construction of intelligent cities, research on environmental sound classification (ESC) has become increasingly important. However, because environmental sound is non-stationary and ambient noise interferes strongly, the recognition accuracy of ESC remains limited. Even with deep learning methods, it is difficult for a model with a single input to extract features fully. To improve the recognition accuracy of ESC, this paper proposes a two-stream convolutional neural network (CNN) based on a raw audio CNN (RACNN) and a logmel CNN (LMCNN). In this method, a pre-emphasis module is first constructed to process the raw audio signal. The processed audio data and the logmel data are then fed into RACNN and LMCNN, respectively, to obtain both time and frequency features of the audio. In addition, a random-padding method is proposed to patch shorter data sequences, which greatly increases the data available for experiments. Finally, the effectiveness of the methods is verified on the UrbanSound8K dataset.
INDEX TERMS Environmental sound classification, sound recognition, convolutional neural networks, data processing, pre-emphasis, two-stream model.
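The two preprocessing steps named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The pre-emphasis filter is the standard first-order form y[n] = x[n] - alpha * x[n-1], and the coefficient 0.97 is a common default assumed here, not a value from the abstract. The abstract does not define the random-padding rule, so `random_pad` shows one plausible reading (the short clip is placed at a random offset inside a zero-filled frame); the function names and parameters are hypothetical.

```python
import random


def pre_emphasis(signal, alpha=0.97):
    """Standard first-order pre-emphasis: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a commonly used default, assumed here for illustration.
    """
    if not signal:
        return []
    out = [signal[0]]  # the first sample has no predecessor and is kept as-is
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def random_pad(signal, target_len, rng=None):
    """Hypothetical random padding: place a short clip at a random offset
    inside a zero-filled frame of length target_len.

    Clips that are already long enough are truncated to target_len.
    """
    rng = rng or random.Random()
    if len(signal) >= target_len:
        return list(signal[:target_len])
    offset = rng.randint(0, target_len - len(signal))  # inclusive bounds
    padded = [0.0] * target_len
    padded[offset:offset + len(signal)] = signal
    return padded


if __name__ == "__main__":
    clip = [1.0, 1.0, 1.0]
    print(pre_emphasis(clip, alpha=0.5))
    print(random_pad(clip, 8, random.Random(0)))
```

Because the offset is drawn at random, the same short clip can yield several differently aligned training examples, which is one way such a scheme could increase the amount of usable data.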
Environmental sound classification is an important problem in the audio recognition field. Compared with structured sounds such as speech and music, the time–frequency structure of environmental sounds is more complicated. To learn time and frequency features from the Log-Mel spectrogram more effectively, this paper proposes a temporal-frequency attention based convolutional neural network model (TFCNN). First, a motivating experiment is designed to verify the effect of specific frequency bands in the spectrogram on model classification. Second, two new attention mechanisms are proposed: a temporal attention mechanism and a frequency attention mechanism. These mechanisms focus on key frequency bands and semantically related time frames in the spectrogram, reducing the influence of background noise and irrelevant frequency bands. Combining the two mechanisms then makes their feature information complementary, so that the critical time–frequency features are captured more accurately and the representation ability of the network is greatly improved. Finally, experiments on two public data sets, UrbanSound8K and ESC-50, demonstrate the effectiveness of the proposed method.
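The idea of weighting frequency bands and time frames separately and then combining the two can be illustrated with a small, non-learned sketch. In the paper the attention weights are presumably learned inside the network; here they are simply a softmax over mean energy, and the additive combination is an assumption, since the abstract does not say how the two mechanisms are merged. The function names are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def time_frequency_attention(spec):
    """Reweight a spectrogram given as a list of rows.

    spec[f][t] is the energy of frequency band f at time frame t.
    Returns (weighted_spec, freq_weights, time_weights).
    """
    n_freq, n_time = len(spec), len(spec[0])
    # Frequency attention: one score per band, averaged over time.
    freq_scores = [sum(row) / n_time for row in spec]
    # Temporal attention: one score per frame, averaged over frequency.
    time_scores = [sum(spec[f][t] for f in range(n_freq)) / n_freq
                   for t in range(n_time)]
    freq_w = softmax(freq_scores)
    time_w = softmax(time_scores)
    # Assumed combination: add the band weight and the frame weight.
    weighted = [[spec[f][t] * (freq_w[f] + time_w[t]) for t in range(n_time)]
                for f in range(n_freq)]
    return weighted, freq_w, time_w


if __name__ == "__main__":
    toy_spec = [[0.0, 0.0, 0.0],
                [1.0, 5.0, 1.0]]
    weighted, freq_w, time_w = time_frequency_attention(toy_spec)
    print(freq_w, time_w)
```

In the toy input, the second band and the middle frame carry the most energy, so they receive the largest weights, while the silent band is scaled down.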
Non-intrusive load monitoring (NILM) is crucial because it allows the operating status of electrical appliances to be monitored online, so that detailed power consumption data for each appliance can be obtained. However, identifying resistive appliances that have similar features in a power grid is still a major problem. In this study, the reconstructed image of a voltage-current (VI) trajectory is used as input to a convolutional neural network (CNN) to classify appliances, particularly resistive ones. Two datasets, PLAID and IDOUC, are used to verify the performance of the proposed method. Comparison with two other methods shows that the reconstructed VI image method performs well in identifying household appliances with similar waveforms.
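To make the notion of a VI trajectory image concrete, the sketch below rasterizes one cycle of voltage and current samples onto a small binary grid. It is only a basic illustration: the abstract does not describe how the authors reconstruct their image, and the function name, the 16 x 16 grid size, and the min-max normalization are assumptions.

```python
import math


def vi_trajectory_image(voltage, current, size=16):
    """Rasterize paired (voltage, current) samples onto a size x size binary grid.

    Columns follow normalized voltage; rows follow normalized current,
    flipped so that larger current appears nearer the top of the image.
    """
    def normalize(xs):
        lo, hi = min(xs), max(xs)
        span = hi - lo
        if span == 0:
            return [0.5] * len(xs)  # flat signal: put every sample mid-scale
        return [(x - lo) / span for x in xs]

    v_norm = normalize(voltage)
    i_norm = normalize(current)
    image = [[0] * size for _ in range(size)]
    for v, i in zip(v_norm, i_norm):
        col = min(int(v * size), size - 1)
        row = size - 1 - min(int(i * size), size - 1)
        image[row][col] = 1
    return image


if __name__ == "__main__":
    # Idealized resistive load: current is proportional to voltage.
    volts = [math.sin(2 * math.pi * k / 64) for k in range(64)]
    amps = [0.5 * v for v in volts]
    for row in vi_trajectory_image(volts, amps, size=8):
        print("".join("#" if px else "." for px in row))
```

For an idealized resistive load the trajectory collapses onto a straight diagonal line, which suggests why resistive appliances with similar waveforms are hard to tell apart from the trajectory shape alone.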
Non-intrusive load monitoring provides real-time monitoring of the operational status of individual devices in the home, along with detailed power usage data. To address the difficulty of extracting complete features, this paper proposes a deep neural network structure with a residual module and a Batch Normalization layer. A new method of converting power data into a two-dimensional image is used to identify electrical devices. The test accuracy on a data set containing 21 kinds of electrical equipment is 97.2%. The experimental results show that the method achieves high recognition accuracy across a large number of household electrical devices, especially for appliances with multiple states and fewer samples.