The study introduces a comprehensive technique for enhancing the Natural Language Processing (NLP) capabilities of virtual assistant systems. The method addresses the challenges of efficient knowledge transfer and model-size optimization while ensuring improved performance, with a primary focus on model pretraining and distillation.

To address the effect of vocabulary size on model performance, the study employs the SentencePiece tokenizer with unigram settings. This approach yields a well-balanced vocabulary, which is essential for striking the right trade-off between task performance and resource efficiency. A novel pre-layernorm design is introduced, drawing inspiration from models like BERT and RoBERTa; it adjusts the placement of layer normalization within transformer layers during the pretraining phase. Teacher models are trained with masked language modeling objectives using the DeepSpeed scaling framework. Modifications to model operations are made, and mixed-precision training strategies are explored to ensure stability.

The two-stage distillation method efficiently transfers knowledge from teacher models to student models. It begins with an intermediate model, and knowledge is distilled using logit and hidden-layer matching techniques. This transfer substantially improves the final student model while maintaining a model size suited to low-latency applications. Novel metrics, such as mask-filling accuracy, are employed to assess the effectiveness and quality of the methods. The findings demonstrate substantial improvements over publicly available models, showcasing the effectiveness of the strategy within complete virtual assistant systems. The proposed approach confirms the potential of the technique to enhance language comprehension and efficiency within virtual assistants, specifically addressing the challenges posed by real-world user inputs.
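The logit and hidden-layer matching described above can be sketched as a combined loss. This is a minimal illustration, not the study's implementation: the function names, the temperature-scaled KL term, and the equal weighting between the two terms are assumptions chosen for clarity.

```python
import numpy as np

def softmax(x, temperature=1.0):
    """Numerically stable softmax along the last axis."""
    z = x / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_logits, student_logits,
                      teacher_hidden, student_hidden,
                      temperature=2.0, alpha=0.5):
    """Combine logit matching (KL divergence between teacher and
    student distributions at a softened temperature) with
    hidden-state matching (mean squared error)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)),
                axis=-1).mean()
    mse = np.mean((teacher_hidden - student_hidden) ** 2)
    # Scaling KL by temperature**2 keeps gradient magnitudes comparable.
    return alpha * kl * temperature**2 + (1 - alpha) * mse
```

When student outputs exactly match the teacher's, both terms vanish and the loss is zero; any divergence in either the output distribution or the intermediate representations increases it.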
Through extensive testing and rigorous analysis, the capability of the method to meet these objectives is validated.
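The pre-layernorm arrangement mentioned above can be illustrated with a simplified transformer block. This is a hedged sketch, not the study's architecture: single-head attention, ReLU in the feed-forward sublayer, and the weight names are all illustrative assumptions. The key point is that layer normalization is applied *before* each sublayer rather than after it, so the residual path stays unnormalized.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (illustrative)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def pre_ln_block(x, Wq, Wk, Wv, W1, W2):
    """Pre-layernorm block: normalize before each sublayer,
    then add the residual (vs. post-LN, which normalizes after)."""
    x = x + self_attention(layer_norm(x), Wq, Wk, Wv)
    x = x + np.maximum(layer_norm(x) @ W1, 0.0) @ W2  # FFN with ReLU
    return x
```

Because normalization sits inside each residual branch, gradients flow through an identity path across the whole stack, which is commonly credited with stabilizing pretraining of deep transformers.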