2015
DOI: 10.1016/j.sigpro.2015.03.010

2D Psychoacoustic modeling of equivalent masking for automatic speech recognition

Cited by 6 publications
(3 citation statements)
References 31 publications
“…In order to evaluate the effectiveness of the proposed algorithm, several comparisons are made against the MFCC baseline [1], RASATA-PLP [4], MFCC+MVA [10] and minimum mean-square error (MMSE) spectral amplitude estimator [15]. We added to our comparison two newly effective techniques [27,28], the first technique denoted (TECH1) proposes the implementation of the 2D psychoacoustic models to MFCC features and the second technique denoted (TECH2) investigates the distribution of Mel-filtered log-spectrum of speech signals in noisy environments.…”
Section: Comparison With Other Methods (citation type: mentioning)
confidence: 99%
“…Speech recognition is becoming effective that yielding a result of over 90% and above word level accuracy for vocabulary speech recognition tasks. However, the accuracy of speech recognition may drop to less than 85% with the effect of noise [2]. Nowadays, this system is very popular in technology where it is applied in medicine, telephone network [3], security devices, ATM machines and computers [4].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…A signal is most likely to get masked by another signal with frequency components which are near to, or the same as, that of the signal. When masking event occurs among any two signals which appear at similar instant of time it is called simultaneous masking or spectral masking, on the other hand, signals which are comparatively postponed in time it is called temporal or non simultaneous masking [10].…”
Section: Introduction (citation type: mentioning)
confidence: 99%