2017 25th European Signal Processing Conference (EUSIPCO) 2017
DOI: 10.23919/eusipco.2017.8081189

Robust distributed multi-speaker voice activity detection using stability selection for sparse non-negative feature extraction

Abstract: In this paper, we propose a robust multi-speaker voice activity detection approach for wireless acoustic sensor networks (WASN). Each node of the WASN receives a mixture of sound sources. We propose a non-negative feature extraction using stability selection that exploits the sparsity of the speech energy signals. The strongest right singular vectors serve as source-specific features for the subsequent voice activity detection (VAD). To separate active speech frames from silent frames, we propose a ro…
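To make the phrase "strongest right singular vectors" in the abstract concrete, here is a minimal sketch, not the paper's algorithm. It assumes a small non-negative matrix of speech energies with one row per node and one column per frame (the layout, the names `gains`, `activity`, `energy`, and the toy values are all assumptions for illustration) and estimates the dominant right singular vector by plain power iteration on A^T A. The sparsity and stability-selection steps described in the abstract are not reproduced.

```python
import math

def top_right_singular_vector(A, n_iter=100):
    """Estimate the largest singular value of A and its right singular
    vector by power iteration on A^T A (A is a list of rows)."""
    m, n = len(A), len(A[0])
    v = [1.0 / math.sqrt(n)] * n  # non-negative unit-norm start vector
    for _ in range(n_iter):
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]  # u = A v
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # A is the zero matrix; keep the start vector
            break
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    sigma = math.sqrt(sum(x * x for x in u))  # sigma = ||A v||
    return sigma, v

# Hypothetical toy data: one source heard by three nodes with different gains,
# active in frames 1, 2 and 4 of six frames (values are made up).
gains = [1.0, 0.5, 0.2]
activity = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
energy = [[g * a for a in activity] for g in gains]

sigma, v = top_right_singular_vector(energy)
```

In this rank-one toy case `v` is proportional to the activity pattern, so its entries are non-zero only in the frames where the source is active. For a matrix with non-negative entries the dominant right singular vector can always be chosen non-negative, which is why a non-negative start vector is used.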

Cited by 11 publications (4 citation statements)
References 20 publications
“…[86][87][88] Further, by focusing exclusively on prediction error, most feature selection approaches fail to consider that their solutions may be difficult to interpret because different subsets of features can result in similar prediction error. In contrast, feature stability is an indicator of biomarker reproducibility, [85] and stability-based feature selection methods have been highly successful in microarray analysis and chemometrics, [89][90][91][92] as well as other applications [93]. Here, we contribute to this body of work in the context of lesion mapping, showing that the identification of stable features can improve mod- A small group of neuroimaging studies have previously leveraged stability analysis with success outside of lesion mapping and our approach may be relevant to other modalities.…”
Section: Improving Models Trained On Neuroimaging Data Through Identi...
mentioning
confidence: 99%
“…Stability Selection is a very powerful method that can be applied to Lasso or Graphical Lasso models (Meinshausen and Bühlmann [2010]), but which also has been extended to Boosting models in Hofner et al [2015]. The Stability Selection has been applied successfully in the context of gene expression (Meinshausen and Bühlmann [2010], Stekhoven et al [2012], Shah and Samworth [2013], Hofner et al [2015]), fMRI data (Ryali et al [2012]) and voice activity detection (Hamaidi et al [2017]), which are exemplary for having very few observations with a huge number of predictors.…”
Section: Introduction
mentioning
confidence: 99%
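The citation statement above refers to stability selection applied to the Lasso. As a rough illustration of that general idea, and not of the procedure in any of the cited papers, the sketch below fits a small coordinate-descent Lasso on random half-size subsamples and keeps the features whose selection frequency reaches a threshold. The function names, the single fixed penalty `lam`, the threshold of 0.6, and the synthetic data are all assumptions made for this example; the original formulation considers a range of penalties rather than one value.

```python
import random

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate-descent Lasso for (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Xw, starting from w = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            # Soft-thresholding update
            if rho > lam:
                new = (rho - lam) / col_sq[j]
            elif rho < -lam:
                new = (rho + lam) / col_sq[j]
            else:
                new = 0.0
            delta = new - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new
    return w

def stability_selection(X, y, lam, n_subsamples=30, threshold=0.6, seed=0):
    """Return (selected feature indices, per-feature selection frequencies)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # half-size subsample, no replacement
        w = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    selected = [j for j in range(p) if freqs[j] >= threshold]
    return selected, freqs

# Synthetic data (made up): only columns 0 and 3 influence y.
rng = random.Random(1)
n, p = 200, 8
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[3] + rng.gauss(0.0, 0.3) for row in X]

selected, freqs = stability_selection(X, y, lam=0.5)
```

Here `freqs[j]` is the fraction of subsamples in which the Lasso gave feature `j` a non-zero coefficient. By construction only columns 0 and 3 carry signal, so those are the features one would expect to be selected consistently across subsamples.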
“…Single channel VAD has been extensively studied [4], [5], but not in the multichannel and the WASN cases. In [6], the VAD problem with WASN is formed as a node clustering problem first, and then the VAD is obtained locally at the clustered nodes. However, the development of distributed VAD method with global decision making strategy is not well studied.…”
Section: Introduction
mentioning
confidence: 99%
“…The estimated speech PSD can be obtained in a similar way. Inserting (15) and the speech PSD estimate in (5) and (6), and with the distributed estimation of ln L_G(ȳ(k, n)), the decision is made by using (10).…”
mentioning
confidence: 99%