Dissemination of improved package and practices of coriander (Coriandrum sativum L.) in Ambala (Haryana)

Kumar, Amit; Prem, Guru; Singh, Upasana; Chaudhary, Vikas; Kumar, Kapil; Malik, Neeraj Pal

doi:10.5958/0976-4615.2021.00043.0

Cited by 1 publication

(3 citation statements)

References 0 publications


“…This is because the content representation z′ is not strictly disentangled from the speaker information. To address this challenge, past works (Choi et al, 2021; 2023) have proposed an information perturbation based training strategy as follows: Instead of feeding the content embedding of the original audio as the input, the audio is perturbed to synthetically modify the speaker characteristics using formant-shifting, pitch-randomization and randomized frequency shaping transforms to obtain x_p = g_heuristic(x). Next, the content embedding is derived from the perturbed audio, z′ = G_c(x_p), while the speaker embedding is still derived from the original audio, s = G_s(x).…”

Section: Synthesizer Training: Iterative Refinement Using Self Transf... (mentioning)

confidence: 99%
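
The data flow described in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `g_heuristic`, `content_encoder` (standing in for G_c) and `speaker_encoder` (standing in for G_s) are hypothetical placeholders, not the cited models, and the perturbation implements only a crude randomized frequency shaping (formant shifting and pitch randomization are omitted).

```python
import numpy as np

def g_heuristic(x, rng, n_bands=8):
    """Hypothetical perturbation: random frequency shaping only.
    Formant shifting and pitch randomization are NOT implemented here."""
    spectrum = np.fft.rfft(x)
    # One random gain per coarse band, linearly interpolated over all bins.
    band_gains = rng.uniform(0.5, 1.5, size=n_bands)
    positions = np.linspace(0, n_bands - 1, spectrum.shape[0])
    gains = np.interp(positions, np.arange(n_bands), band_gains)
    return np.fft.irfft(spectrum * gains, n=x.shape[0])

def content_encoder(x, frame_len=160):
    """Placeholder for G_c: log-magnitude spectrum of non-overlapping frames."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

def speaker_encoder(x):
    """Placeholder for G_s: time-average of the frame features."""
    return content_encoder(x).mean(axis=0)

def make_training_inputs(x, rng):
    """Content comes from the perturbed audio, speaker from the original."""
    x_p = g_heuristic(x, rng)      # x_p = g_heuristic(x)
    z = content_encoder(x_p)       # z' = G_c(x_p)
    s = speaker_encoder(x)         # s  = G_s(x)
    return z, s

rng = np.random.default_rng(0)
x = rng.standard_normal(16000)     # stand-in for one second of audio
z, s = make_training_inputs(x, rng)
```

The point of the sketch is the asymmetry: only the content branch sees the perturbed signal, so a synthesizer trained on (z, s) cannot rely on speaker cues that leak through the content features.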

“…Deriving meaningful representations from speech has been a topic of significant interest because such representations can be useful for both downstream recognition and upstream speech generation tasks. While some techniques (Défossez et al, 2022; Eloff et al, 2019; Liao et al, 2022; Kumar et al, 2023) aim to compress speech into a data-efficient codec, another line of research has focused on disentangling the learned features into components such as speaker characteristics (voice or timbre), linguistic content (phonetic information) and prosodic information (pitch modulation and speaking rate) (Chou et al, 2019; Qian et al, 2019; Wu & Lee, 2020; Chen et al, 2021; Qian et al, 2022; Hussain et al, 2023). Representation disentanglement allows controllable speech synthesis by training a model to reconstruct the audio from the disentangled features.…”

Section: Introduction (mentioning)

confidence: 99%

“…To remove speaker information from the SSL model outputs, some techniques utilize an information bottleneck approach such as quantization (Polyak et al, 2021; Lakhotia et al, 2021; Gu et al, 2021). Alternatively, several researchers have proposed training strategies that employ an information perturbation technique to eliminate speaker information without quantization (Qian et al, 2022; Choi et al, 2021; 2023; Hussain et al, 2023). Notably, for training synthesizers, NANSY (Choi et al, 2021) and NANSY++ (Choi et al, 2023) propose to heuristically perturb the voice of a given utterance with hand-engineered data augmentations, before obtaining the output from the SSL model.…”

Section: Introduction (mentioning)

confidence: 99%


Universal Adversarial Perturbations for Speech Recognition Systems

Neekhara, Hussain, Pandey et al. 2019

Interspeech 2019


In this work, we demonstrate the existence of universal adversarial audio perturbations that cause mis-transcription of audio signals by automatic speech recognition (ASR) systems. We propose an algorithm to find a single quasi-imperceptible perturbation which, when added to any arbitrary speech signal, will most likely fool the victim speech recognition model. Our experiments demonstrate the application of our proposed technique by crafting audio-agnostic universal perturbations for the state-of-the-art ASR system Mozilla DeepSpeech. Additionally, we show that such perturbations generalize to a significant extent across models that are not available during training, by performing a transferability test on a WaveNet-based ASR system.


Section: Synthesizer Training: Iterative Refinement Using Self Transf... (mentioning)

confidence: 99%
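
The abstract above describes one perturbation that is added to arbitrary speech signals. A minimal Python sketch of that application step follows. The abstract does not describe the search algorithm, so this does not attempt it. The function name, the max-norm bound `eps`, and the tile-and-crop handling of length mismatches are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def apply_universal_perturbation(x, v, eps=0.01):
    """Add one fixed perturbation v to an arbitrary waveform x.

    Assumptions (not from the source): v is bounded in max-norm by eps
    to keep it quiet, and is tiled or cropped to match the length of x.
    """
    v = np.clip(v, -eps, eps)                 # enforce the amplitude bound
    reps = int(np.ceil(len(x) / len(v)))
    v_fit = np.tile(v, reps)[: len(x)]        # match the signal length
    return np.clip(x + v_fit, -1.0, 1.0)      # stay in valid audio range

rng = np.random.default_rng(0)
v = rng.standard_normal(4000)                 # stand-in for a learned perturbation
short_clip = np.zeros(1000)
long_clip = np.zeros(10000)
adv_short = apply_universal_perturbation(short_clip, v)
adv_long = apply_universal_perturbation(long_clip, v)
```

The same `v` is reused for both clips, which is what makes the perturbation audio-agnostic: nothing in the application step depends on the content of `x`.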