Speaker identification in the household scenario (e.g., for smart speakers) is typically based on only a few enrollment utterances but a much larger set of unlabeled data, which suggests using semi-supervised learning to improve speaker profiles. We propose a graph-based semi-supervised learning approach for speaker identification in the household scenario that leverages the unlabeled speech samples. In contrast to most work in speaker recognition, which focuses on speaker-discriminative embeddings, this work focuses on speaker label inference (scoring). Given a pre-trained embedding extractor, graph-based learning allows us to integrate information about both labeled and unlabeled utterances. Considering each utterance as a graph node, we represent pairwise utterance similarity scores as edge weights. Graphs are constructed per household, and speaker identities are propagated to unlabeled nodes to optimize a global consistency criterion. We show in experiments on the VoxCeleb dataset that this approach makes effective use of unlabeled data and improves speaker identification accuracy compared to two state-of-the-art scoring methods as well as their semi-supervised variants based on pseudo-labels.
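The per-household graph construction and label propagation described in this abstract can be illustrated with a minimal sketch. It assumes one common way to optimize a global consistency criterion, the iterative "local and global consistency" update F <- alpha * S * F + (1 - alpha) * Y with a symmetrically normalized similarity matrix S; the abstract does not state which propagation rule the authors use. The function names, the cosine-similarity edge weights, the value of `alpha`, and the toy 2-D "embeddings" are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_graph(embeddings):
    """Dense graph: edge weight = cosine similarity clipped at 0, no self-loops."""
    n = len(embeddings)
    return [
        [0.0 if i == j else max(0.0, cosine(embeddings[i], embeddings[j]))
         for j in range(n)]
        for i in range(n)
    ]


def propagate_labels(W, labels, num_speakers, alpha=0.9, iterations=200):
    """Iterate F <- alpha * S F + (1 - alpha) * Y, with S = D^-1/2 W D^-1/2.

    labels[i] is a speaker index for enrolled utterances, None for unlabeled ones.
    Returns the highest-scoring speaker index for every node.
    """
    n = len(W)
    inv_sqrt_deg = [1.0 / math.sqrt(max(sum(row), 1e-12)) for row in W]
    S = [[W[i][j] * inv_sqrt_deg[i] * inv_sqrt_deg[j] for j in range(n)]
         for i in range(n)]
    # One-hot rows for enrolled utterances, all-zero rows for unlabeled ones.
    Y = [[1.0 if labels[i] == k else 0.0 for k in range(num_speakers)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iterations):
        F = [
            [alpha * sum(S[i][j] * F[j][k] for j in range(n)) + (1 - alpha) * Y[i][k]
             for k in range(num_speakers)]
            for i in range(n)
        ]
    return [max(range(num_speakers), key=lambda k: F[i][k]) for i in range(n)]


# Toy household (made-up data): two enrolled utterances, four unlabeled ones.
embeddings = [
    [1.0, 0.1],  # enrollment utterance, speaker 0
    [0.1, 1.0],  # enrollment utterance, speaker 1
    [0.9, 0.2],
    [0.8, 0.3],
    [0.2, 0.9],
    [0.3, 0.8],
]
labels = [0, 1, None, None, None, None]

predicted = propagate_labels(build_graph(embeddings), labels, num_speakers=2)
print(predicted)  # [0, 1, 0, 0, 1, 1]
```

In this sketch every utterance in the household, labeled or not, influences the final scores through the graph, which is the property that distinguishes graph-based inference from scoring each test utterance against the enrollment profiles in isolation.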
It is estimated that around 70 million people worldwide are affected by a speech disorder called stuttering [1]. With recent advances in Automatic Speech Recognition (ASR), voice assistants are increasingly useful in our everyday lives. Many technologies in education, retail, telecommunication and healthcare can now be operated through voice. Unfortunately, these benefits are not accessible to People Who Stutter (PWS). We propose a simple but effective method called 'Detect and Pass' to make modern ASR systems accessible to People Who Stutter in a limited data setting. The algorithm uses a context-aware classifier, trained on a limited amount of data, to detect acoustic frames that contain stutter. To improve robustness on stuttered speech, this extra information is passed on to the ASR model to be used during inference. Our experiments show a reduction of 12.18% to 71.24% in Word Error Rate (WER) across various state-of-the-art ASR systems. By varying the threshold on the posterior probability of stutter for each stacked frame used in determining low frame rate (LFR) acoustic features, we were able to determine an optimal setting that reduced the WER by 23.93% to 71.67% across different ASR systems.
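For readers unfamiliar with the metric reported in this abstract, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the ASR output, divided by the number of reference words. The sketch below shows that standard definition only; it is not the authors' evaluation code, and the example transcripts are made up to show how repeated words, as might appear when stuttered speech is transcribed literally, count as insertions.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word)  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: two repeated words are scored as two insertions.
reference = "turn on the kitchen lights"
hypothesis = "turn turn on the the kitchen lights"
wer = word_error_rate(reference, hypothesis)
print(round(wer, 2))  # 0.4
```

The abstract does not say whether the reported reductions are relative or absolute, so the sketch stops at the WER itself.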
This study spatially monitored drought in Iran using distance measurement indicators. Five drought indicators measured from 2016 to 2020 were used: temperature condition index (TCI), vegetation condition index (VCI), vegetation health index (VHI), precipitation condition index (PCI), and standardized precipitation index (SPI). The TCI revealed that the largest percentage of the region classified as "severe drought" occurred in 2020. The VCI indicated that the largest portion of the country (73.30%) experiencing "moderate drought" occurred in 2018. The VHI indicated that vegetation stress increased throughout the region, and areas of severe and moderate drought reached their greatest extents in those years. Although significant droughts have occurred primarily in the central, eastern, and southeastern parts of Iran, mild droughts have occurred in northern Iran as well. The PCI indicated that rainfall amounts diminished in most of the country during the study period. The SPI showed that northern Iran received heavy rain and was the only region classified as "extremely wet." Most of the rest of Iran was "moderately dry." The analysis of the VHI for agricultural plants showed that 27.71% of Iran's agricultural regions experienced "critical drought" conditions; these included the provinces of Razavi Khorasan, Hamadan, and Khozestan. The study can provide a baseline for selecting the most useful drought-monitoring indicators and can enable a deeper understanding of drought in arid and semi-arid regions.
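As background on three of the indices named in this abstract, the sketch below uses the commonly cited Kogan-style definitions: VCI rescales a vegetation index (NDVI here) between its historical minimum and maximum, TCI does the same for land surface temperature with the scale inverted, and VHI is a weighted average of the two. Whether the study uses exactly these inputs and an equal weighting is an assumption, and the pixel values are made up.

```python
def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation condition index: 0 = historically worst, 100 = best greenness."""
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)


def tci(lst, lst_min, lst_max):
    """Temperature condition index: hotter surface -> lower (more stressed) value."""
    return 100.0 * (lst_max - lst) / (lst_max - lst_min)


def vhi(vci_value, tci_value, alpha=0.5):
    """Vegetation health index: weighted mean of VCI and TCI (alpha is assumed 0.5)."""
    return alpha * vci_value + (1.0 - alpha) * tci_value


# Made-up pixel: current NDVI and land surface temperature (kelvin) with
# hypothetical multi-year minima and maxima for the same pixel and period.
pixel_vci = vci(ndvi=0.25, ndvi_min=0.10, ndvi_max=0.60)
pixel_tci = tci(lst=310.0, lst_min=290.0, lst_max=320.0)
pixel_vhi = vhi(pixel_vci, pixel_tci)
print(round(pixel_vci, 1), round(pixel_tci, 1), round(pixel_vhi, 1))
```

Lower values indicate more stress on all three scales; the class boundaries behind labels such as "moderate" or "severe" drought are not given in the abstract, so none are encoded here.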