2020
DOI: 10.1121/10.0001449

Model-based distributed node clustering and multi-speaker speech presence probability estimation in wireless acoustic sensor networks

Abstract: A great challenge in the wireless acoustic sensor network (WASN) based signal processing is to develop robust speech presence probability (SPP) estimation methods, which can work at each time frame and each frequency band. The knowledge of SPP plays an essential role in speech enhancement and noise estimation. Single channel SPP estimation and centralized multi-channel SPP estimation have been well studied. However, few efforts can be found for the distributed SPP estimation for WASN applications with multiple…

Cited by 16 publications (7 citation statements)
References 41 publications
“…) Higher SRO estimation accuracy can be obtained by determining the non-integer value λ[i] that maximises |p^i_{Γ,kq}[λ]|, via an interpolation method such as a golden section search in the interval [λ_max[i] − 0.5, λ_max[i] + 0.5], as proposed in [13], and substituting λ_max[i] by λ[i] in (16).…”
Section: A. Coherence-drift-based SRO Estimation (mentioning)
confidence: 99%
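
The refinement step described in this citation statement can be illustrated with a minimal golden section search. This is only a sketch under stated assumptions, not the citing paper's implementation: `golden_section_max` and `toy_magnitude` are hypothetical names, and `toy_magnitude` is a made-up unimodal stand-in for the magnitude being maximised, with its peak placed at a non-integer lag of 3.27 for illustration.

```python
import math


def golden_section_max(f, lo, hi, tol=1e-6):
    """Return an x in [lo, hi] that maximises f, assuming f is unimodal there."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, roughly 0.618
    # Two interior probe points placed symmetrically in the interval.
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            # The maximum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # The maximum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def toy_magnitude(lam):
    """Hypothetical stand-in objective with a single peak at lam = 3.27."""
    return 1.0 / (1.0 + (lam - 3.27) ** 2)


# Assumed integer-lag maximiser, found beforehand by a coarse search.
lam_max = 3
# Refine within half a sample on either side of the integer maximiser.
lam_hat = golden_section_max(toy_magnitude, lam_max - 0.5, lam_max + 0.5)
print(lam_hat)
```

With this toy objective the refined value should land close to 3.27. Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the objective is needed per step.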
“…which avoids the influence of VAD errors on the results. In practice, the VAD obviously needs to be estimated from the microphone signals [15], [16]. The clock of node 1 is set as the reference, with f_{s,1} = 16 kHz.…”
Section: Numerical Experiments (mentioning)
confidence: 99%
“…An ideal VAD was used to isolate the influence of VAD errors. In practice VAD information may be shared among the nodes [31], using a speaker-selective VAD [32] and/or estimating the speech presence probability in a distributed fashion [33]. All nodes in Fig.…”
Section: B. Batch-processing (mentioning)
confidence: 99%
“…Microphone clustering in ASNs has been previously explored using e.g., coherence-based features [12], [13], eigenvectors [7], divergence of power spectral densities [14] or cepstral features [6], [15]. These clustering solutions and applications, although effective, do not explicitly incorporate privacy considerations and evaluations are confined to shoebox-type room scenarios.…”
Section: Relation To Prior Work (mentioning)
confidence: 99%