2020
DOI: 10.1109/access.2020.3032226

Environment Sound Classification Based on Visual Multi-Feature Fusion and GRU-AWS

Abstract: Environmental Sound Classification (ESC) raises two major questions: what is the best audio recognition framework, and what is the most robust audio feature? To investigate these questions, this paper uses a Gated Recurrent Unit (GRU) network to analyze the effect of single features, namely the Mel Scale Spectrogram (Mel), the Log-Mel Scale Spectrogram (LM), and Mel frequency cepstral coefficients (MFCC), as well as the multi-feature combinations Mel-MFCC, LM-MFCC, and Mel-LM-MFCC (T-M). The experiment resu…

citations
Cited by 21 publications
(13 citation statements)
references
References 42 publications
“…The performance of the model was assessed with seven evaluation indices: accuracy (14), sensitivity (15), specificity (16), precision (17), the F1 score (18), Cohen's kappa (19), and the Matthews correlation coefficient (MCC) (20).…”
Section: Model Evaluation (mentioning)
confidence: 99%
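
The seven evaluators named in this statement can all be derived from a confusion matrix. The following is a minimal sketch, assuming a binary (two-class) setting; the citing paper's equations (14) to (20) are not reproduced here, so the function name `binary_metrics` and the standard textbook definitions used below are assumptions rather than the authors' exact formulation.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute seven common evaluation metrics from binary confusion counts.

    Hypothetical helper for illustration. It does not guard against
    division by zero, so every row and column of the confusion matrix
    must contain at least one sample.
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall, true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)

    # Matthews correlation coefficient.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts, used only to exercise the function.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

For a multi-class task such as ESC, these quantities are usually computed per class and then averaged; that extension is left out of the sketch.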
“…Feature extraction from MFCCs was performed using pre-emphasis, windowing, fast Fourier transform, Mel filtering, nonlinear transformation, and discrete cosine transform [15]. The first feature consisted of 40-dimension MFCCs [16,17]. Next, for the second and third features, we calculated the MFCC trajectories over time (delta MFCCs) and the second-order delta of MFCCs.…”
Section: Comparison and Evaluation (mentioning)
confidence: 99%
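
The delta and second-order delta features described in this statement are commonly computed with a regression over neighbouring frames. The sketch below assumes the widely used formula d_t = Σ n (c_{t+n} − c_{t−n}) / (2 Σ n²) with edge padding; the function name `delta`, the window half-width `width=2`, and the synthetic input are assumptions, since the citing paper's exact settings are not given here.

```python
import numpy as np


def delta(features, width=2):
    """Regression-based delta of a (n_coeffs, n_frames) feature matrix.

    Hypothetical helper: `width` is the number of frames used on each
    side, and the first/last frames are repeated to pad the edges.
    """
    n_frames = features.shape[1]
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))

    out = np.zeros_like(features, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n: width + n + n_frames]    # c[t + n]
        behind = padded[:, width - n: width - n + n_frames]   # c[t - n]
        out += n * (ahead - behind)
    return out / denom


if __name__ == "__main__":
    # Synthetic stand-in for a 40-dimension MFCC matrix with 20 frames.
    mfcc = np.tile(np.arange(20.0), (40, 1))
    d1 = delta(mfcc)   # first-order delta (trajectory over time)
    d2 = delta(d1)     # second-order delta
    stacked = np.concatenate([mfcc, d1, d2], axis=0)
    print(stacked.shape)
```

Stacking the static coefficients with both deltas yields one feature matrix with three times as many rows as the MFCC matrix alone.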
“…Our feature engineering process was derived from reference [31]. Fusing multi-spectrogram features into one new feature has been proposed to improve sound recognition accuracy [31]. A total of three features were extracted.…”
Section: Methods (mentioning)
confidence: 99%
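
One simple way to fuse the three spectrogram-type features named in the abstract (Mel, LM, MFCC) is to standardize each and concatenate them along the feature axis. This is only a sketch under stated assumptions: the fusion rule, the per-feature standardization, the unnormalized DCT-II, and all function names below are illustrative choices, not the method confirmed by the paper or the citing work.

```python
import numpy as np


def log_mel(mel, eps=1e-10):
    """Convert a Mel power spectrogram (n_mels, n_frames) to decibels."""
    return 10.0 * np.log10(np.maximum(mel, eps))


def mfcc_from_log_mel(lm, n_mfcc=40):
    """Apply an unnormalized DCT-II across the Mel axis, keep n_mfcc rows."""
    n_mels = lm.shape[0]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi / n_mels * (n + 0.5) * k)
    return dct @ lm


def standardize(x):
    """Zero-mean, unit-variance scaling so the features share one range."""
    return (x - x.mean()) / (x.std() + 1e-8)


def fuse(mel, n_mfcc=40):
    """Stack Mel, log-Mel, and MFCC into one (2*n_mels + n_mfcc, n_frames) matrix."""
    lm = log_mel(mel)
    mfcc = mfcc_from_log_mel(lm, n_mfcc)
    return np.concatenate(
        [standardize(mel), standardize(lm), standardize(mfcc)], axis=0
    )


if __name__ == "__main__":
    # Random positive values stand in for a real Mel spectrogram.
    rng = np.random.default_rng(0)
    mel = rng.random((64, 100)) + 0.1
    print(fuse(mel).shape)
```

The fused matrix keeps the time axis intact, so it can be fed frame by frame to a recurrent model such as a GRU.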