2011
DOI: 10.1109/tasl.2011.2109380

Ideal Binary Mask Ratio: A Novel Metric for Assessing Binary-Mask-Based Sound Source Separation Algorithms

Abstract: A number of metrics have been proposed in the literature to assess sound source separation algorithms. The addition of convolutional distortion raises further questions about the assessment of source separation algorithms in reverberant conditions, as reverberation is shown to undermine the optimality of the ideal binary mask (IBM) in terms of signal-to-noise ratio (SNR). Furthermore, with a range of mixture parameters common across numerous acoustic conditions, SNR-based metrics demonstrate an inconsis…

Cited by 5 publications (7 citation statements)
References 25 publications
“…Given that STOI evaluates the reconstructed signals as a whole rather than each classification decision independently, it holds promise for being able to predict outcomes for masks with different error distributions because it can take into account the perceptual relevance of the errors. Several other metrics have been proposed to assess sound source separation algorithms, such as the loudness-weighted H-FA (Yu et al, 2014), the IBM ratio (Hummersone et al, 2011), and the intelligibility metric based on an auditory preprocessing model (Christiansen et al, 2010). However, these metrics have gained limited traction due to either lacking generalizability or accessibility.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In those techniques, estimation of the ideal mask has been treated as a binary classification problem, which was achieved by advanced machine learning methods. Correspondingly, several mask-based objective measures, such as hit minus false alarm rate (HIT-FA) (Kim et al, 2009) and ideal binary mask ratio (IBMR) (Hummersone et al, 2011) have also been developed to predict the intelligibility of binary masked speech. Mask-based objective intelligibility measures are obtained by tabulating the mismatched T-F units between the estimated binary mask and IBM.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Mask-based objective intelligibility measures are obtained by tabulating the mismatched T-F units between the estimated binary mask and IBM. Such objective measures have two major advantages: (i) First, and perhaps the most important advantage, is that the calculation of mask-based objective measures does not require synthesized output, and is thus robust to many convolutional distortions not directly associated with the binary masking algorithm itself (Hummersone et al, 2011); and (ii) second, they allow for evaluation of binary masking techniques in contrast to existing objective intelligibility measures (Goldsworthy and Greenberg, 2004; Kates and Arehart, 2005; Ma et al, 2009) where binary T-F weighting effect has not been considered.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Objective metrics for evaluating the performance of these binary masking algorithms are of great interest, since subjective tests can be time-consuming [5, 9, 10]. …”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Mask-based objective speech-intelligibility measures such as hit rate minus false alarm rate (HIT-FA) [5] and ideal binary mask ratio (IBMR) [9] were proposed and frequently used as metrics to evaluate the performance of binary masking techniques. Those mask-based objective intelligibility measures are often obtained by counting the mismatched T-F units between estimated binary mask and IdBM.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
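
Several of the statements above describe mask-based measures as being obtained by counting mismatched T-F units between an estimated binary mask and the IBM. As a rough illustration only, the sketch below computes the commonly used HIT-FA form (hit rate minus false-alarm rate) from two binary masks. The function name `hit_fa`, the list-of-lists mask layout, and the example masks are assumptions made for this sketch and do not come from any of the cited papers. The IBMR itself is not implemented here, because the statements do not give its formula.

```python
from typing import Sequence, Tuple

# Hypothetical layout: rows are frequency channels, columns are time frames;
# each entry is 1 (target-dominant T-F unit) or 0 (interference-dominant).
Mask = Sequence[Sequence[int]]


def hit_fa(estimated: Mask, ideal: Mask) -> Tuple[float, float, float]:
    """Return (hit rate, false-alarm rate, HIT-FA) for two binary masks."""
    if len(estimated) != len(ideal):
        raise ValueError("masks must have the same number of rows")

    hits = misses = false_alarms = correct_rejections = 0
    for est_row, ideal_row in zip(estimated, ideal):
        if len(est_row) != len(ideal_row):
            raise ValueError("mask rows must have the same length")
        for est, ref in zip(est_row, ideal_row):
            if ref == 1 and est == 1:
                hits += 1                 # target unit correctly retained
            elif ref == 1 and est == 0:
                misses += 1               # target unit wrongly discarded
            elif ref == 0 and est == 1:
                false_alarms += 1         # interference unit wrongly retained
            else:
                correct_rejections += 1   # interference unit correctly discarded

    # Guard against masks that contain only ones or only zeros.
    target_units = hits + misses
    interference_units = false_alarms + correct_rejections
    hit_rate = hits / target_units if target_units else 0.0
    fa_rate = false_alarms / interference_units if interference_units else 0.0
    return hit_rate, fa_rate, hit_rate - fa_rate


# Toy 2-channel, 4-frame masks (made up for illustration).
ideal = [[1, 1, 0, 0],
         [1, 0, 0, 1]]
estimated = [[1, 0, 0, 1],
             [1, 0, 0, 1]]

hit, fa, score = hit_fa(estimated, ideal)
print(hit, fa, score)  # prints: 0.75 0.25 0.5
```

Because the computation uses only the two masks, no resynthesized audio is needed, which is the property the statements above highlight for mask-based measures.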