2021
DOI: 10.1007/s12652-021-03468-3
In domain training data augmentation on noise robust Punjabi Children speech recognition

Cited by 14 publications (6 citation statements)
References 63 publications
“…Environmental sound classification and acoustic scene classification: ResNet [45,70,97,112], Autoencoder DNN [98-100,120]. Snore sound classification, speech emotion recognition, and respiratory sound classification: Gated Recurrent Unit (GRU) [24,128,175]. Various applications: Deep neural network (DNN) [77,121,125,128,146], CNN [32,40,76,81,86,88,96,128,140,157,161,165,167,173,174,177,178,180], Deep CNN (DCNN) [102,111,179]. Sound event detection, speaker detection…”
Section: Classification Methods References (mentioning)
confidence: 99%
“…Class-dependent temporal-spectral structures and long-term descriptive statistics features were extracted for sound events. Other authors applied the Discrete Gabor Transform (DGT) audio image representation [119], multiresolution features [53], a hybrid method based on mel-frequency cepstral coefficients and gammatone frequency cepstral coefficients [62], inverted MFCC and extended MFCC [66], bag of audio words (BoAW) [120], and narrow-band auto-correlation features (NB-ACF) [121].…”
Section: Feature Extraction Methods in Sound Classification (mentioning)
confidence: 99%
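To make the MFCC-style features concrete, here is a minimal sketch of MFCC extraction in NumPy. The frame length, hop size, FFT size, filter count, and coefficient count are common illustrative defaults, not values taken from any of the cited papers:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters with centres spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        lo, c, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, c):
            fb[i - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fb[i - 1, k] = (hi - k) / max(hi - c, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_ceps=13):
    # Frame, window, power spectrum, mel filterbank, log, DCT-II.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    fb_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_e = np.log(fb_energy + 1e-10)
    # DCT-II decorrelates the log filterbank energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    (2 * n + 1) / (2.0 * n_filters)))
    return log_e @ basis.T
```

Only the first dozen or so cepstral coefficients are typically kept, since the DCT concentrates the spectral-envelope information there.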
“…Despite the shortfalls identified in the selected articles, we also found that the application of data augmentation methods in sound classification research has made significant progress in the last five years, between 2017 and 2022. First, integrating data augmentation techniques into sound classification and recognition has helped to improve generalization ability, as reported by the authors in [62,69,72,92,98,103]. Second, the introduction of class-specific data augmentation techniques for imbalanced datasets has helped to overcome overfitting [67,86,92,104], thereby increasing prediction performance [58,61] and classification stability [59,65,66,69,91,106].…”
Section: Classification Methods for Sound Classification (mentioning)
confidence: 99%
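Two of the waveform-level augmentations commonly used in this literature, noise injection at a target signal-to-noise ratio and random time shifting, can be sketched as follows. The function names and parameter choices are illustrative, not taken from the cited papers:

```python
import numpy as np

def add_noise(signal, snr_db, rng):
    # Inject white Gaussian noise scaled to a target SNR in dB.
    sig_power = np.mean(signal ** 2)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def time_shift(signal, max_shift, rng):
    # Shift the waveform by a random number of samples, zero-padding
    # the vacated region instead of wrapping around.
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(signal)
    if shift >= 0:
        out[shift:] = signal[:len(signal) - shift]
    else:
        out[:shift] = signal[-shift:]
    return out
```

Applying such transforms to each training utterance yields several acoustically plausible variants per label, which is the generalization benefit the quoted survey describes.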
“…Consequently, the class probability for a particular utterance at a given time in such a structure is obtained with a SoftMax nonlinearity:

p(s_j | x_t) = exp(a_j(x_t)) / Σ_k exp(a_k(x_t)),

where a_j(x_t) corresponds to the activation of the output-layer unit associated with HMM state s_j. The objective function is then optimized with the standard error back-propagation procedure [13], by evaluating a natural cost function C, the cross-entropy between the target state probabilities d_j and the SoftMax outputs:

C = -Σ_j d_j log p(s_j | x_t).…”
Section: Theoretical Background (mentioning)
confidence: 99%
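The SoftMax posterior and the cross-entropy cost described above can be sketched numerically. This is a generic illustration of the two formulas, not the authors' implementation:

```python
import math

def softmax(activations):
    # p_j = exp(a_j) / sum_k exp(a_k); subtract the max for
    # numerical stability before exponentiating.
    m = max(activations)
    exps = [math.exp(a - m) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(targets, probs):
    # C = -sum_j d_j * log p_j  (terms with d_j = 0 contribute nothing)
    return -sum(d * math.log(p) for d, p in zip(targets, probs) if d > 0)
```

With one-hot targets, the cost reduces to the negative log-probability of the correct HMM state, which is the quantity back-propagation drives down.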
“…The extracted feature vectors should be efficient at capturing relevant information while discarding the redundancies that originate from noise in the input speech signal. Therefore, various feature extraction techniques, RASTA-PLP [12], MFCC [13], GFCC [14] and PNCC [15], have been investigated by researchers in an effort to deploy an effective noise-robust ASR system. For many years, the HMM has been a widely adopted modeling technique for efficient learning of the parameters of an acoustic model [16].…”
Section: Introduction (mentioning)
confidence: 99%
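Since the quoted passage centres on HMM-based acoustic modeling, a generic sketch of the forward algorithm, which accumulates the likelihood of an observation sequence under an HMM, may help. All probabilities below are toy values, not parameters from the paper:

```python
def hmm_forward(obs_prob, trans, init):
    # obs_prob[t][s]: likelihood of the t-th observation under state s
    # trans[i][j]:    transition probability from state i to state j
    # init[s]:        initial probability of state s
    n_states = len(init)
    alpha = [init[s] * obs_prob[0][s] for s in range(n_states)]
    for t in range(1, len(obs_prob)):
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * obs_prob[t][j]
            for j in range(n_states)
        ]
    return sum(alpha)  # total likelihood, marginalized over state paths
```

In a real recognizer the per-state observation likelihoods would come from Gaussian mixtures or, in the hybrid setup discussed above, from scaled DNN posteriors.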