Interspeech 2018
DOI: 10.21437/interspeech.2018-1884

Global SNR Estimation of Speech Signals Using Entropy and Uncertainty Estimates from Dropout Networks

Abstract: This paper demonstrates two novel methods to estimate the global SNR of speech signals. In both methods, the Deep Neural Network-Hidden Markov Model (DNN-HMM) acoustic model used in speech recognition systems is leveraged for the additional task of SNR estimation. In the first method, the entropy of the DNN-HMM output is computed. Recent work on Bayesian deep learning has shown that a DNN-HMM trained with dropout can be used to estimate model uncertainty by approximating it as a deep Gaussian process. In the secon…
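
The first method, at the level of detail given in the abstract, amounts to computing the entropy of the acoustic model's per-frame output posteriors. Below is a minimal sketch assuming the DNN-HMM emits softmax senone posteriors per frame; the function name `frame_entropy`, the array shapes, and the idea of averaging entropy over frames as an SNR proxy are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def frame_entropy(posteriors, eps=1e-12):
    """Entropy (in nats) of per-frame senone posteriors.

    posteriors: array of shape (num_frames, num_senones) whose rows sum to 1,
    e.g. the softmax output of a DNN-HMM acoustic model.
    """
    p = np.clip(posteriors, eps, 1.0)
    return -np.sum(p * np.log(p), axis=1)

# Hypothetical usage: higher average entropy suggests a less confident model,
# i.e. noisier input and hence a lower global SNR.
posteriors = np.random.dirichlet(np.ones(3000), size=500)  # 500 frames, 3000 senones
utterance_score = frame_entropy(posteriors).mean()
```

An utterance-level statistic of this kind would typically be mapped to a global SNR value in dB by a small regressor, but the abstract alone does not specify that mapping.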

Cited by 8 publications (4 citation statements). References 14 publications.
“…More recently, efficient strategies have been proposed to incorporate uncertainty estimation into deep neural networks [52,6]. Among them, MC-Dropout [12] and Deep Ensembles [21] are two of the most popular approaches given that they are agnostic to the specific network architecture [1,15,2,26]. More concretely, MC-Dropout adds stochastic dropout during inference into the intermediate network layers.…”
Section: Related Work (mentioning)
Confidence: 99%
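
The MC-Dropout procedure mentioned in this excerpt can be sketched in a few lines of PyTorch; the network, layer sizes, and function names below are illustrative assumptions rather than the architecture of any cited paper.

```python
import torch
import torch.nn as nn

# Illustrative classifier with dropout layers; sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(40, 256), nn.ReLU(), nn.Dropout(p=0.3),
    nn.Linear(256, 128), nn.ReLU(), nn.Dropout(p=0.3),
    nn.Linear(128, 10),
)

def mc_dropout_samples(model, x, num_passes=20):
    """Keep dropout stochastic at inference and collect several forward passes."""
    model.train()  # leaves nn.Dropout active; BatchNorm layers would need extra care
    with torch.no_grad():
        return torch.stack([model(x).softmax(dim=-1) for _ in range(num_passes)])

samples = mc_dropout_samples(model, torch.randn(8, 40))  # (num_passes, batch, classes)
```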
“…In order to implicitly model the infinite set F, NeRF employs a neural network f_θ(x, d) with parameters θ which outputs the density α and radiance r for any given input location-view pair {x, d}. Using this network, NeRF is able to estimate the color c(x_o, d) for any given pixel defined by a 3D camera position x_o and view direction d using the volumetric rendering function:

c(x_o, d) = \int T(t)\, \alpha(x_t)\, r(x_t, d)\, dt, \qquad T(t) = \exp\!\Big(-\int_0^{t} \alpha(x_s)\, ds\Big), \quad (1)

where x_t = x_o + td corresponds to 3D locations along a ray with direction d originated at the camera origin and intersecting with the pixel at x_o.…”
Section: Deterministic and Stochastic Neural Radiance Fields (mentioning)
Confidence: 99%
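
For concreteness, Eq. (1) is normally evaluated with a numerical quadrature along each ray; the NumPy sketch below shows one common discretisation, with the array shapes and the function name `render_ray` chosen purely for illustration.

```python
import numpy as np

def render_ray(alphas, radiances, deltas):
    """Discretised volumetric rendering along one ray.

    alphas: (N,) densities, radiances: (N, 3) colors, deltas: (N,) sample spacings,
    sampled at points x_t = x_o + t_i * d along the ray.
    """
    opacity = 1.0 - np.exp(-alphas * deltas)                       # per-sample opacity
    transmittance = np.cumprod(np.concatenate(([1.0], 1.0 - opacity))[:-1])
    weights = transmittance * opacity                              # quadrature weights
    return (weights[:, None] * radiances).sum(axis=0)              # pixel color c(x_o, d)

color = render_ray(np.random.rand(64), np.random.rand(64, 3), np.full(64, 0.05))
```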
“…To address this limitation, other approaches have explored other strategies to implicitly learn the parameter distribution. For instance, dropout-based methods [9,2,12,6] introduce stochasticity over the intermediate neurons of the network in order to efficiently encode different possible solutions in the parameter space. By evaluating the model with different dropout configurations over the same input, the uncertainty can be quantified by computing the variance over the set of obtained outputs.…”
Section: Again) (mentioning)
Confidence: 99%
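
Building on the MC-Dropout sketch above, the variance-over-outputs uncertainty described in this excerpt (together with entropy-style measures like those in the abstract) can be computed from the stack of stochastic forward passes roughly as follows; `samples` is assumed to hold per-pass class probabilities of shape (passes, batch, classes), as in the earlier snippet.

```python
import torch

def dropout_uncertainty(samples, eps=1e-12):
    """samples: (num_passes, batch, classes) class probabilities from MC-Dropout."""
    mean_p = samples.mean(dim=0)                                    # predictive mean
    variance = samples.var(dim=0).sum(dim=-1)                       # variance over passes
    pred_entropy = -(mean_p * (mean_p + eps).log()).sum(dim=-1)     # total uncertainty
    exp_entropy = -(samples * (samples + eps).log()).sum(dim=-1).mean(dim=0)
    mutual_info = pred_entropy - exp_entropy                        # model (epistemic) part
    return variance, pred_entropy, mutual_info
```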
“…The reason being that in a noisy environment audio loses its information and thus visual modality can be used to increase the overall performance of voice activity detection as visual modality is independent of noise. To detect that ambient surrounding is noisy, people have used SNR estimation approaches such as [17].…”
Section: Introduction (mentioning)
Confidence: 99%