2016
DOI: 10.14569/ijacsa.2016.070904
Designing and Implementing of Intelligent Emotional Speech Recognition with Wavelet and Neural Network

Abstract: Recognition of emotion from speech is a significant subject in human-machine interaction. In this study, the speech signal is analyzed to create a recognition system able to identify human emotion, and a new set of characteristics is proposed in the time, frequency, and time-frequency domains to increase accuracy. After extracting Pitch, MFCC, Wavelet, ZCR, and Energy features, neural networks classify four emotions from the EMO-DB and SAVEE databases. Combination of features for two emotio… Show more
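The abstract names several frame-level features, two of which (zero-crossing rate and short-time energy) are simple enough to sketch directly. The snippet below is a minimal illustration of how such features are typically computed from a framed signal; the frame length, hop size, and toy sine input are assumptions for demonstration, not parameters from the paper.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames (assumed sizes)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs whose product is negative."""
    return np.mean(frames[:, :-1] * frames[:, 1:] < 0, axis=1)

def short_time_energy(frames):
    """Mean squared amplitude per frame."""
    return np.mean(frames ** 2, axis=1)

# Toy input: a 103 Hz sine sampled at 8 kHz for one second
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 103 * t)

frames = frame_signal(x)
zcr = zero_crossing_rate(frames)
energy = short_time_energy(frames)
```

For a pure tone, the ZCR stays near twice the frequency divided by the sampling rate, and the per-frame energy stays near 0.5, which makes these features easy to sanity-check before feeding them to a classifier.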

Cited by 2 publications (1 citation statement). References 8 publications.
“…Additionally, Linear Prediction Coefficients (LPC) and Linear Spectral frequencies (LSF) were used for different applications like speaker recognition [15], spoken digits recognition [16], and emotion recognition from speech [17]. Discrete Wavelet Transform is another feature extraction technique that has been used for speaker recognition [18], speech semantic and emotions recognitions [19,20]. Furthermore, spectrogram images are the best choice of speech and acoustic feature extraction that is suitable for CNN models.…”
Section: Introduction
confidence: 99%
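The citation statement above lists the Discrete Wavelet Transform among feature-extraction techniques for speaker and emotion recognition. As an illustration only, the sketch below implements a single-level Haar DWT in plain numpy and derives log-energy features from the detail bands; the Haar basis, three-level depth, and toy tone are assumptions chosen for simplicity, not the wavelet configuration used in the cited works.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)   # low-pass: local averages
    detail = (even - odd) / np.sqrt(2)   # high-pass: local differences
    return approx, detail

def haar_features(x, levels=3):
    """Log-energy of the detail band at each decomposition level."""
    feats = []
    for _ in range(levels):
        x, d = haar_dwt(x)
        feats.append(np.log(np.sum(d ** 2) + 1e-12))
    return np.array(feats)

signal = np.sin(2 * np.pi * np.arange(512) / 16)  # toy tone
feats = haar_features(signal)
```

Because the Haar transform is orthonormal, each level preserves signal energy, so the per-band energies partition the total — a convenient property when using band energies as classifier inputs.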