Multi-intent utterances have become very important for spoken language understanding (SLU). Multi-intent systems and algorithms add complexity to SLU compared to single-intent systems, because they require an accurate system that can identify intents and slots at the fine-grained (i.e., word/token) level and can also handle the relation between intents and slots locally at the utterance level. In this setting, intents may belong to multiple domains and multiple classes. Similarly, slots may belong to multiple classes, and slots of the same class may be related to multiple intent classes. Unfortunately, very little work has addressed these issues at the fine-grained level so far. To solve this problem, we propose a stacking-ensemble strategy. The first stage of this system combines three different types of multitask NLP models, built on top of pre-trained BERT, XLNet, and ELMo. A stacking ensemble layer then learns to predict the best possible results. We evaluated our model on four publicly available datasets. The results on these public datasets show that our system outperforms existing multi-intent systems at both the token level and the sentence level.
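The two-stage idea in the abstract above (base models first, then a learned stacking layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the meta-learner, so a small softmax-regression layer is assumed here, and the label set, the toy probabilities, and all names (`LABELS`, `StackingLayer`, `stack_features`) are hypothetical. The per-token distributions stand in for held-out outputs of three base taggers.

```python
import math
import random

LABELS = ["O", "B-city", "B-date"]  # hypothetical token-level label set


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(base_probs):
    """Concatenate the per-token label distributions of all base models."""
    return [p for dist in base_probs for p in dist]


class StackingLayer:
    """Assumed meta-learner: softmax regression over concatenated base outputs."""

    def __init__(self, n_features, n_labels, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
                        for _ in range(n_labels)]
        self.bias = [0.0] * n_labels

    def predict_proba(self, x):
        logits = [sum(w * v for w, v in zip(row, x)) + b
                  for row, b in zip(self.weights, self.bias)]
        return softmax(logits)

    def predict(self, x):
        probs = self.predict_proba(x)
        return max(range(len(probs)), key=probs.__getitem__)

    def fit(self, X, y, lr=0.5, epochs=300):
        # Plain stochastic gradient descent on the cross-entropy loss.
        for _ in range(epochs):
            for x, gold in zip(X, y):
                probs = self.predict_proba(x)
                for k, row in enumerate(self.weights):
                    grad = probs[k] - (1.0 if k == gold else 0.0)
                    for j, v in enumerate(x):
                        row[j] -= lr * grad * v
                    self.bias[k] -= lr * grad


# Toy held-out predictions: one entry per token, holding the label
# distributions of three hypothetical base models plus the gold label index.
held_out = [
    ([[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.6, 0.2, 0.2]], 0),
    ([[0.1, 0.8, 0.1], [0.2, 0.6, 0.2], [0.3, 0.4, 0.3]], 1),
    ([[0.1, 0.1, 0.8], [0.1, 0.3, 0.6], [0.2, 0.2, 0.6]], 2),
    ([[0.2, 0.7, 0.1], [0.1, 0.3, 0.6], [0.2, 0.6, 0.2]], 1),  # second model is wrong
    ([[0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.1, 0.2, 0.7]], 2),  # first model is wrong
    ([[0.6, 0.3, 0.1], [0.3, 0.5, 0.2], [0.7, 0.2, 0.1]], 0),  # second model is wrong
]
X = [stack_features(probs) for probs, _ in held_out]
y = [gold for _, gold in held_out]

layer = StackingLayer(n_features=len(X[0]), n_labels=len(LABELS))
layer.fit(X, y)
predicted = [LABELS[layer.predict(x)] for x in X]
```

In a realistic setup the stacking layer would be fit on base-model predictions for data the base models were not trained on, so that it learns which model to trust for which label rather than memorizing their training-set confidence.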
In Spoken Language Understanding (SLU), the ability to detect out-of-domain (OOD) input dialog plays a very important role (e.g., in voice assistants and chatbot systems). However, most existing OOD detection methods rely heavily on manually labeled OOD data. Manually labeling OOD data for a dynamically changing and evolving area is time-consuming and not immediately possible, which limits the feasibility of these models in practical applications. To solve this problem, we consider the scenario of having no labeled OOD data (i.e., zero-shot learning). To achieve this goal, we use intent-focused semantic parsing, extracted with the help of Transformer-based techniques (e.g., BERT [26]). The two main components of the intent-focused semantic parsing are (a) the sentence-level intents and (b) the token-level intent classes, which show the relation of slot tokens to intent classes. Finally, we combine both kinds of information and use a One-Class Neural Network (OC-NN) based zero-shot classifier. Our system shows better results than the state of the art on four publicly available datasets.
INDEX TERMS Spoken language understanding, out-of-domain (OOD) detection, zero-shot learning.
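The zero-shot setting described in the abstract above (training on in-domain utterances only, then flagging inputs that fall outside them) can be sketched as follows. This is a simplified stand-in, not the authors' OC-NN: it uses a linear one-class objective of the one-class SVM form, which OC-NN extends with a learned hidden layer, and it updates the threshold `r` as the `nu`-quantile of the training scores. The two confidence features and all names (`confidence_features`, `train_one_class`, `is_ood`) are assumptions made for illustration; the abstract does not say how sentence-level and token-level intent information are combined.

```python
def confidence_features(sentence_probs, token_probs):
    """Hypothetical features: peak sentence-level intent probability and
    mean peak token-level intent probability."""
    sentence_conf = max(sentence_probs)
    token_conf = sum(max(p) for p in token_probs) / len(token_probs)
    return [sentence_conf, token_conf]


def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def nu_quantile(scores, nu):
    ordered = sorted(scores)
    return ordered[min(int(nu * len(ordered)), len(ordered) - 1)]


def train_one_class(X, nu=0.2, lr=0.1, epochs=100):
    """Subgradient descent on 0.5*||w||^2 + 1/(nu*N) * sum(max(0, r - w.x)) - r,
    with r reset to the nu-quantile of the scores after every step."""
    w = [1.0] * len(X[0])
    r = nu_quantile([dot(w, x) for x in X], nu)
    for _ in range(epochs):
        violators = [x for x in X if dot(w, x) < r]
        for j in range(len(w)):
            pull = sum(x[j] for x in violators) / (nu * len(X))
            w[j] -= lr * (w[j] - pull)
        r = nu_quantile([dot(w, x) for x in X], nu)
    return w, r


def is_ood(w, r, x):
    # Scores below the learned threshold are treated as out-of-domain.
    return dot(w, x) < r


# Toy in-domain utterances: (sentence-level intent probabilities,
# per-token intent probabilities). No OOD examples are used for training.
in_domain = [
    ([0.90, 0.05, 0.05], [[0.90, 0.05, 0.05], [0.80, 0.10, 0.10]]),
    ([0.05, 0.85, 0.10], [[0.10, 0.80, 0.10], [0.10, 0.90, 0.00]]),
    ([0.10, 0.10, 0.80], [[0.10, 0.20, 0.70], [0.00, 0.10, 0.90]]),
    ([0.75, 0.15, 0.10], [[0.70, 0.20, 0.10], [0.80, 0.10, 0.10]]),
    ([0.10, 0.88, 0.02], [[0.05, 0.90, 0.05], [0.10, 0.85, 0.05]]),
]
X = [confidence_features(s, t) for s, t in in_domain]
w, r = train_one_class(X)

unclear_query = confidence_features(
    [0.40, 0.30, 0.30], [[0.40, 0.35, 0.25], [0.34, 0.33, 0.33]])
clear_query = confidence_features(
    [0.95, 0.03, 0.02], [[0.95, 0.03, 0.02], [0.90, 0.05, 0.05]])
ood_flag = is_ood(w, r, unclear_query)
in_domain_flag = is_ood(w, r, clear_query)
```

Confidence summaries are used instead of the raw probability vectors because a flat distribution lies between the confident ones, so a linear scorer on raw probabilities could not separate it from them; a hidden layer, as in OC-NN, is one way to remove that restriction.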