Spoken language understanding (SLU), which converts user requests in natural language into machine-interpretable expressions, is becoming an essential task. Because existing SLU systems are based on statistical approaches, the lack of training data is a serious problem, especially for new system tasks. In this paper, we propose using two sources of the "wisdom of crowds," crowdsourcing and a knowledge community website, to improve SLU systems. We first collect paraphrasing variations for new system tasks through crowdsourcing as seed data, and then augment them with similar questions from a knowledge community website. We investigate the effect of the proposed data augmentation method on the SLU task, even with small seed data. In particular, the proposed architecture produced more than 120,000 augmented samples and improved SLU accuracy.
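The two-stage idea described above (small labeled seed data, then expansion with similar questions) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say how similarity is measured, so token-level Jaccard overlap is used here purely as an assumption, and the names `tokenize`, `jaccard`, `augment`, the `threshold` value, and the example utterances and labels are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def augment(seeds, candidates, threshold=0.5):
    """Label candidate questions that are similar enough to a seed.

    seeds: list of (utterance, label) pairs, e.g. crowdsourced paraphrases.
    candidates: unlabeled questions, e.g. from a knowledge community website.
    A candidate inherits the label of its most similar seed when the
    similarity reaches `threshold` (an assumed, illustrative cutoff).
    """
    seed_tokens = [(tokenize(utt), label) for utt, label in seeds]
    augmented = []
    for cand in candidates:
        cand_tokens = tokenize(cand)
        best_score, best_label = 0.0, None
        for tokens, label in seed_tokens:
            score = jaccard(cand_tokens, tokens)
            if score > best_score:
                best_score, best_label = score, label
        if best_label is not None and best_score >= threshold:
            augmented.append((cand, best_label))
    return augmented


# Hypothetical seed data and candidate questions, for illustration only.
seeds = [
    ("what is the weather tomorrow", "weather"),
    ("how do i get to the station", "route"),
]
candidates = [
    "what is the weather like tomorrow",
    "how do i get to the airport",
    "best recipe for pancakes",
]
result = augment(seeds, candidates)
```

In this toy run, the first two candidates are close enough to a seed to be added with that seed's label, while the unrelated third candidate is discarded. A real system would likely need a stronger similarity measure and some filtering of noisy matches.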