Power level difference as a criterion for speech enhancement

Yousefian, Nima; Rahmani, Mostafa; Akbari, Ahmad

doi:10.1109/icassp.2009.4960668

Cited by 10 publications

(15 citation statements)

References 5 publications

“…Since the distance between the primary and secondary microphone is distinct but short in a near field, as shown in Fig. 1, the signal power received at the primary microphone close to the mouth shows a stronger signal compared to the signal power of the secondary microphone, while the level of the noise signal at each microphone is almost identical [33], [34]. Based on this, we firstly define the difference of the signal power between the primary microphone and the secondary microphone as so that is derived for the primary and secondary microphone such that (7) where .…”

Section: Review of PLD (mentioning)

confidence: 93%
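
The idea in the statement above (speech is stronger at the primary microphone while noise power is roughly equal at both) can be illustrated with a minimal sketch. This is not the cited authors' implementation: the function names, the Hann window, the frame sizes, and the smoothing constant `alpha` are all assumptions chosen for illustration.

```python
import numpy as np

def frame_power_spectra(x, frame_len=256, hop=128, n_fft=512):
    """Return |DFT|^2 for each windowed frame (rows = frames, cols = bins).

    n_fft > frame_len zero-pads every frame before the transform.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2

def power_level_difference(primary, secondary, alpha=0.9, **frame_kwargs):
    """Smoothed power of the primary channel minus that of the secondary.

    alpha is an assumed recursive-smoothing constant, not a value from
    the cited papers.
    """
    p1 = frame_power_spectra(primary, **frame_kwargs)
    p2 = frame_power_spectra(secondary, **frame_kwargs)
    pld = np.empty_like(p1)
    s1, s2 = p1[0].copy(), p2[0].copy()
    for t in range(p1.shape[0]):
        # First-order recursive estimate of each channel's power spectrum.
        s1 = alpha * s1 + (1.0 - alpha) * p1[t]
        s2 = alpha * s2 + (1.0 - alpha) * p2[t]
        pld[t] = s1 - s2
    return pld
```

With the same noise in both channels and the speech component attenuated at the secondary microphone, the difference is zero in noise-only input and positive on average when speech is present. Real diffuse noise is equal only in expected power, not sample by sample, so a practical PLD fluctuates around zero rather than vanishing.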

“…Detailed specification will be given in Section IV. Based on these conditions, we first define the noisy signal received at the microphones by , where denotes the microphone index and is the sample index such that (1) where is the convolution operator, is the main source signal, is the impulse response associated with the microphones, is the noise-free reverberant speech, and is the noise component at the each microphone, respectively [33], [34]. The above equations could be changed frame-by-frame into a frequency domain by taking the discrete Fourier transform (DFT) which length is bigger than the frame size as follows:…”

Section: Review of PLD (mentioning)

confidence: 99%
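
The symbols of the signal model in the statement above were lost in extraction. A conventional way to write such a two-microphone model is sketched below. The symbol names are assumptions chosen for illustration and are not necessarily those used in the citing paper; the second line follows only from linearity of the DFT and may differ in form from the paper's own equation.

```latex
% Assumed notation: i = microphone index, n = sample index, * = convolution,
% s = source signal, h_i = impulse response to microphone i,
% x_i = noise-free reverberant speech, d_i = noise at microphone i.
y_i(n) = s(n) * h_i(n) + d_i(n) = x_i(n) + d_i(n), \qquad i \in \{1, 2\}

% Frame-wise DFT (frequency bin k, frame index l), by linearity:
Y_i(k, l) = X_i(k, l) + D_i(k, l)
```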

“…Indeed, the algorithm based on the power level difference (PLD) is useful with highly non-stationary noises. However, the performance of the technique is sensitive to the noise level and noise types if the PLD is used in the criterion of the gain for speech enhancement as in [33]. In particular, Jeub et al. [35] derived the normalized difference of the PSD of the noisy speech to update the noise PSD.…”

Section: Introduction (mentioning)

confidence: 98%

“…In addition, the methods proposed in [33]-[36] utilize the difference in the power of the signals received at the two microphones. These techniques rely on the fact that the speech signals emitted from the source (i.e., mouth) have different power levels at the microphones, while the power levels of the noise signals are almost identical.…”

Section: Introduction (mentioning)

confidence: 99%

“…For this reason, we first propose a long-term PLDR (LT-PLDR) using a large long-term smoothing parameter for calculating the PLDR. While our approach is based on the PLD proposed by Yousefian et al. [33], [34], we offer the PLDR that can produce robust and superior performance under various noise environments. Specifically, the PLDR is defined by the ratio of the PLD of the input signals and the PLD of noise estimated during speech inactivity.…”

Section: Introduction (mentioning)

confidence: 99%
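
The PLDR described in the statement above (the PLD of the input divided by the PLD of noise estimated during speech inactivity) can be sketched as follows. The excerpt does not give the exact formula, so everything here is an assumption: the helper names `smooth` and `pld_ratio`, the use of a per-frame scalar PLD, the externally supplied `noise_only` mask, and the smoothing constants used to contrast short-term and long-term behaviour.

```python
import numpy as np

def smooth(x, alpha):
    """First-order recursive smoothing: y[t] = alpha*y[t-1] + (1-alpha)*x[t]."""
    y = np.empty(len(x), dtype=float)
    acc = float(x[0])
    for t, value in enumerate(x):
        acc = alpha * acc + (1.0 - alpha) * value
        y[t] = acc
    return y

def pld_ratio(pld, noise_only, alpha, floor=1e-8):
    """Ratio of the smoothed input PLD to the PLD measured in noise-only frames.

    pld        : per-frame PLD values (assumed already summed over frequency)
    noise_only : boolean mask marking frames assumed to contain no speech
    alpha      : smoothing constant; a larger value gives a longer-term ratio
    floor      : guard against division by a near-zero noise PLD (assumption)
    """
    pld = np.asarray(pld, dtype=float)
    noise_only = np.asarray(noise_only, dtype=bool)
    noise_pld = max(abs(pld[noise_only].mean()), floor)
    return smooth(pld, alpha) / noise_pld

# Toy input: four noise-only frames followed by four speech frames.
pld = [1.0, 1.0, 1.0, 1.0, 10.0, 10.0, 10.0, 10.0]
noise_only = [True, True, True, True, False, False, False, False]
st_pldr = pld_ratio(pld, noise_only, alpha=0.5)   # short-term: reacts quickly
lt_pldr = pld_ratio(pld, noise_only, alpha=0.95)  # long-term: reacts slowly
```

In this toy input both ratios sit at 1 during the noise-only frames; the short-term ratio rises quickly once speech starts, while the long-term ratio rises slowly, which is the trade-off a two-step scheme would combine.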

Dual-Microphone Voice Activity Detection Technique Based on Two-Step Power Level Difference Ratio

Choi; Chang

2014, IEEE/ACM Trans. Audio Speech Lang. Process.

In this paper, we propose a novel dual-microphone voice activity detection (VAD) technique based on the two-step power level difference (PLD) ratio. This technique basically exploits the PLD between the primary microphone and the secondary microphone in a mobile device when the distance between the microphones and the sound source is relatively short. Based on the PLD, we propose the use of the PLD ratio (PLDR) instead of the original PLD to take advantage of the relative difference between the PLD of speech and the PLD of noise. Indeed, the PLDR is obtained by estimating the ratio of the PLD between the input signals and the PLD between the two channel noises during periods without speech. The proposed technique offers a two-step algorithm using the PLDRs, including long-term PLDR (LT-PLDR), which characterizes long-term evolution, and short-term PLDR (ST-PLDR), which characterizes short-time variation, during the first step. LT-PLDR-based and ST-PLDR-based VAD decisions are performed using the maximum a posteriori (MAP) probability derived from the model-trust algorithm and combined at the second step to reach a superior VAD decision for both long-term and short-term situations. Extensive experimental results show that the proposed dual-microphone VAD technique outperforms the conventional two-channel VAD method as well as most standardized VAD algorithms.

Index Terms: dual-microphone, power level difference ratio, two-step, voice activity detection.

Section: Review of PLD (mentioning)

confidence: 93%