Speech keyword search (KWS) is the task of automatically detecting specified keywords in continuous speech; detecting a single keyword corresponds to the keyword wake-up (wake-word) task. For many practical applications of these small-vocabulary tasks, building a full large-vocabulary speech recognition system is costly and unnecessary. For keyword search, the main challenge remains the scarcity of data resources. Speech pre-training has become an effective technique, showing its superiority across a variety of tasks: the key idea is to learn effective representations from large amounts of unlabeled data in order to improve performance when labeled data for downstream tasks are limited. This work combines unsupervised pre-training with keyword search based on the Keyword-Filler model, using pre-trained architectures from the wav2vec 2.0 family, including XLSR. The results show that training on features extracted by the pre-trained model outperforms the baseline. Under low-resource conditions, the baseline's performance drops significantly, whereas the performance of the fine-tuned pre-trained model does not decrease and even increases slightly in some intervals, showing that the pre-trained model can be fine-tuned to strong performance on very little data. This demonstrates the advantage and practical value of keyword search based on unsupervised pre-training.
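The Keyword-Filler idea mentioned above can be illustrated with a toy sketch (this is a hypothetical simplification, not the paper's implementation): a keyword path forces a monotonic alignment of the keyword's acoustic units to the frames, a filler path absorbs everything with a generic filler unit, and the detection score is their log-likelihood ratio. The frame representation (dicts of per-unit log-probabilities) and the unit names (`"<flr>"`, etc.) are assumptions for illustration only.

```python
import math

NEG = float("-inf")

def keyword_path_logprob(frames, keyword):
    """Best monotonic alignment of keyword units to frames.

    frames: list of dicts mapping unit id -> log-probability for that frame.
    keyword: sequence of unit ids spelling the keyword.
    """
    T, K = len(frames), len(keyword)
    dp = [[NEG] * K for _ in range(T)]          # dp[t][k]: best path ending at frame t, unit k
    dp[0][0] = frames[0].get(keyword[0], NEG)
    for t in range(1, T):
        for k in range(K):
            stay = dp[t - 1][k]                  # repeat the current unit
            advance = dp[t - 1][k - 1] if k > 0 else NEG  # move to the next unit
            best = max(stay, advance)
            if best > NEG:
                dp[t][k] = best + frames[t].get(keyword[k], NEG)
    return dp[T - 1][K - 1]                      # must have consumed all keyword units

def filler_path_logprob(frames, filler="<flr>"):
    """All frames explained by the generic filler unit."""
    return sum(f.get(filler, NEG) for f in frames)

def detect(frames, keyword, threshold=0.0):
    """Return (log-likelihood ratio, keyword-detected flag)."""
    llr = keyword_path_logprob(frames, keyword) - filler_path_logprob(frames)
    return llr, llr > threshold
```

In a real system the per-frame unit posteriors would come from an acoustic model (here, one trained on features from the pre-trained wav2vec 2.0 encoder), and the threshold would be tuned on a development set.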
Out-of-distribution (OOD) learning deals with scenarios in which training and test data follow different distributions. Although general OOD problems have been studied intensively in machine learning, graph OOD is only an emerging area of research, and there is currently no systematic benchmark tailored to evaluating graph OOD methods. In this work, we develop an OOD benchmark for graphs, known as GOOD. We explicitly distinguish between covariate and concept shifts and design data splits that accurately reflect each kind of shift. We consider both graph and node prediction tasks, since designing shifts differs in key ways between the two. Overall, GOOD contains 8 datasets with 14 domain selections; combined with covariate, concept, and no shifts, this yields 42 different splits. We report results for 7 commonly used baseline methods with 10 random runs, giving 294 dataset-model combinations in total. Our results show significant performance gaps between in-distribution and OOD settings, and also shed light on how different methods trend differently under covariate versus concept shifts. The GOOD benchmark is a growing project and is expected to expand in both the quantity and variety of its resources as the area develops. It can be accessed via https://github.com/divelab/GOOD/.
* Equal contributions. Preprint, under review.
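The covariate/concept distinction drawn above can be sketched with a minimal toy splitter (an illustrative assumption, not the benchmark's actual splitting code): a covariate split holds out unseen domain values for the test set, while a concept split changes the domain-label correlation between training and test.

```python
def covariate_split(samples, domain_of, train_domains):
    """Covariate shift: train and test cover disjoint domain values."""
    train = [s for s in samples if domain_of(s) in train_domains]
    test = [s for s in samples if domain_of(s) not in train_domains]
    return train, test

def concept_split(samples, domain_of, label_of, correlated):
    """Concept shift: training keeps samples where domain and label agree
    under the given (spurious) correlation; the test set gets the rest,
    so the correlation a model may latch onto breaks at test time."""
    train = [s for s in samples if correlated(domain_of(s), label_of(s))]
    test = [s for s in samples if not correlated(domain_of(s), label_of(s))]
    return train, test
```

For example, with molecule graphs one might use scaffold as the domain: a covariate split tests on unseen scaffolds, while a concept split pairs each scaffold with a different label distribution at test time.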