Efficient Intent Detection with Dual Sentence Encoders

Casanueva, Iñigo; Temčinas, Tadas; Gerz, Daniela; Henderson, Matthew; Vulić, Ivan

doi:10.18653/v1/2020.nlp4convai-1.5

Cited by 207 publications (186 citation statements)

References 27 publications

Citation types: Supporting, Mentioning (183), Contrasting


“…Note that, besides quicker pretraining, intent classifiers based on ConveRT encodings train 40 times faster than BERT-LARGE-based ones, as only the classification layers are trained for ConveRT. Additional experiments related to efficiency of intent classification have been conducted by Casanueva et al (2020). In sum, these preliminary results suggest that ConveRT as a sentence encoder can be useful beyond the core response selection task.…”

Section: Banking (mentioning)

confidence: 86%

“…In sum, these preliminary results suggest that ConveRT as a sentence encoder can be useful beyond the core response selection task. The usefulness of ConveRT-based sentence representations have been recently confirmed on other intent classification datasets (Casanueva et al, 2020), with different intent classifiers (Bunk et al, 2020), and in another dialog task: turn-based value extraction (Coope et al, 2020; Bunk et al, 2020; Mehri et al, 2020). In future work, we plan to investigate other possible applications of transfer, especially for the challenging low-data setups.…”

Section: Banking (mentioning)

confidence: 87%

“…This notable reduction in size and training acceleration are achieved through combining 8-bit embedding quantization and quantization-aware training, subword-level parameterization, and pruned self-attention. Furthermore, the lightweight design allows us to reserve additional parameters to improve the expressiveness of the dual-encoder architecture; this leads to improved learning of conversational representations that can be transferred to other dialog tasks such as intent detection and slot filling, as already demonstrated by recent work (Casanueva et al, 2020; Bunk et al, 2020; Coope et al, 2020).…”

Section: Scalability and Portability (mentioning)

confidence: 94%

“…Therefore, we also probe the usefulness of ConveRT encodings for transfer learning in the intent classification task: the model must classify the user's utterance into one of several predefined classes, that is, intents (e.g., within e-banking intents can be card lost or replace card). We use BANKING77 (Casanueva et al, 2020) diverse domains, see Table 1, divided into train, dev and test sets using an 80/10/10 split. We use the pretrained ConveRT encodings r_x on the input side (see Figure 1) as input to an intent classification model.…”

Section: Methods (mentioning)

confidence: 99%
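
The 80/10/10 train/dev/test split mentioned in the statement above could look roughly like the sketch below. This is an illustration only, not the authors' procedure: the helper name `split_80_10_10`, the shuffling step, and the fixed seed are all assumptions.

```python
import random


def split_80_10_10(examples, seed=0):
    """Hypothetical helper: shuffle, then split into train/dev/test (80/10/10)."""
    items = list(examples)
    # Assumption: a seeded shuffle so the split is reproducible.
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10  # 80% for training
    n_dev = len(items) // 10        # 10% for development
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]  # the remainder (about 10%) for testing
    return train, dev, test
```

Integer arithmetic is used for the boundaries so the three parts always cover every example exactly once, with any rounding remainder falling into the test set.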


ConveRT: Efficient and Accurate Conversational Representations from Transformers

Henderson, Casanueva, Mrkšić, et al. 2020

Findings of the Association for Computational Linguistics: EMNLP 2020


General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train. We propose ConveRT (Conversational Representations from Transformers), a pretraining framework for conversational tasks satisfying all the following requirements: it is effective, affordable, and quick to train. We pretrain using a retrieval-based response selection task, effectively leveraging quantization and subword-level parameterization in the dual encoder to build a lightweight memory- and energy-efficient model. We show that ConveRT achieves state-of-the-art performance across widely established response selection tasks. We also demonstrate that the use of extended dialog history as context yields further performance gains. Finally, we show that pretrained representations from the proposed encoder can be transferred to the intent classification task, yielding strong results across three diverse data sets. ConveRT trains substantially faster than standard sentence encoders or previous state-of-the-art dual encoders. With its reduced size and superior performance, we believe this model promises wider portability and scalability for Conversational AI applications.


“…Notice that the intent classifier is typically implemented using standard text classification algorithms (Weiss et al, 2012; Larson et al, 2019; Casanueva et al, 2020). Consequently, to perform OOS sample detection, methods often rely on one-class classification or threshold-rejection-based techniques using the probability outputs for each class (Larson et al, 2019) or reconstruction errors (Ryu et al, 2017, 2018).…”

Section: Introduction (mentioning)

confidence: 99%
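
The threshold-rejection idea in the statement above, which uses per-class probability outputs to flag out-of-scope (OOS) inputs, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited paper: the function name `classify_with_rejection`, the default threshold of 0.5, and the label string `"out_of_scope"` are hypothetical.

```python
def classify_with_rejection(class_probs, threshold=0.5, oos_label="out_of_scope"):
    """Return the most probable intent, or oos_label if confidence is too low.

    class_probs: dict mapping intent name -> probability from some classifier.
    threshold and oos_label are illustrative choices, not values from the source.
    """
    # Pick the intent with the highest predicted probability.
    best_label = max(class_probs, key=class_probs.get)
    # Reject the prediction when the top probability falls below the threshold.
    if class_probs[best_label] < threshold:
        return oos_label
    return best_label
```

For example, `classify_with_rejection({"card_lost": 0.9, "replace_card": 0.1})` keeps the top intent, while a flat distribution whose top probability is below the threshold is mapped to the OOS label. In practice the threshold would be tuned on held-out data.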

Improving Out-of-Scope Detection in Intent Classification by Using Embeddings of the Word Graph Space of the Classes

Cavalin, Ribeiro, Appel, et al. 2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)


This paper explores how intent classification can be improved by representing the class labels not as a discrete set of symbols but as a space where the word graphs associated with each class are mapped using typical graph embedding techniques. The approach, inspired by a previous algorithm used for an inverse dictionary task, allows the classification algorithm to take into account inter-class similarities provided by the repeated occurrence of some words in the training examples of the different classes. The classification is carried out by mapping text embeddings to the word graph embeddings of the classes. Focusing solely on improving the representation of the class label set, we show in experiments conducted on both private and public intent classification datasets that better detection of out-of-scope (OOS) examples is achieved and, as a consequence, that the overall accuracy of intent classification is also improved. In particular, using the recently released Larson dataset, an error of about 9.9% has been achieved for OOS detection, beating the previous state-of-the-art result by more than 31 percentage points.
