ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022
DOI: 10.1109/icassp43922.2022.9746170

Joint Far- and Near-End Speech Intelligibility Enhancement Based on the Approximated Speech Intelligibility Index

Abstract: This paper considers speech enhancement of signals picked up in one noisy environment which must be presented to a listener in another noisy environment. Recently, it has been shown that an optimal solution to this problem requires the consideration of the noise sources in both environments jointly. However, the existing optimal mutual information based method requires a complicated system model that includes natural speech variations, and relies on approximations and assumptions of the underlying signal distr…

Cited by 5 publications (7 citation statements)
References 18 publications
“…The optimum gains for the minimum processing NLE (19) are equal for all frequencies within a subband j. This is an important result because most work on subband SNR based enhancement make this assumption for convenience in the joint optimization across all subbands [9], [35], [38], [40], [56]. However, as we show, it can be deduced from (20) that the optimal solution does not depend upon k, and it is optimal to have the same gain across the entire subband when optimizing for each subband individually.…”
Section: Optimization Problem and Solution
mentioning
confidence: 99%
“…For example, [25]-[28] optimize the glimpse proportion metric [29], whereas [6], [30], [31] use the distortion measure of [32]. One of the most widely used optimization targets for SI enhancement is the Speech Intelligibility Index (SII) [33] or variations thereof, which have been used for NLE in numerous studies, e.g., [34]-[40]. The SII based approaches [34], [35] show good performance but fall behind the state-of-the-art heuristic [19] in subjective tests [10], [11], because the optimization targets do not correlate well with subjective intelligibility across varying noises and degradations [41].…”
Section: Introduction
mentioning
confidence: 99%
“…The second class of NLE algorithms is based on the main idea of manipulating the input speech such that a target intelligibility metric is maximized when the noise conditions are known. One of the most widely used optimization targets for SI enhancement is the Speech Intelligibility Index (SII) [25] or variations thereof, which have been used for NLE in numerous studies, e.g., [26], [27], [28], [29], [30], [31], [32]. The SII based approaches [26], [27] show good performance but fall behind the state-of-the-art heuristic [19] in subjective tests [10], [11], because the optimization targets do not correlate well with subjective intelligibility across varying noises and degradations [33].…”
mentioning
confidence: 99%
“…Furthermore, some methods, e.g., [26], [27], require solving an optimization problem in real-time with varying execution time [23]. The SII based solutions also rely on simplifying assumptions about frequency gains being constant across frequency subbands [32]. Recently, deep neural network (DNN) based approaches [34], [35] have been able to optimize more advanced measures such as the extended short-time objective intelligibility (ESTOI) measure [36] that correlate more with subjective tests than the simpler metrics such as SII [33].…”
mentioning
confidence: 99%