Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1315

Keeping Consistency of Sentence Generation and Document Classification with Multi-Task Learning

Abstract: The automated generation of information indicating the characteristics of articles such as headlines, key phrases, summaries and categories helps writers to alleviate their workload. Previous research has tackled these tasks using neural abstractive summarization and classification methods. However, the outputs may be inconsistent if they are generated individually. The purpose of our study is to generate multiple outputs consistently. We introduce a multi-task learning model with a shared encoder and multiple…
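The abstract describes a shared-encoder multi-task setup in which a generation decoder and a document classifier read the same source representation. As a rough illustration only (not the authors' released code; all layer choices, sizes, and names below are assumptions), a PyTorch sketch of that shape might look like:

import torch
import torch.nn as nn

class SharedEncoderMultiTask(nn.Module):
    # One shared encoder feeds (1) a document classifier and (2) a sequence
    # decoder for headline/summary generation, so both outputs are conditioned
    # on the same source representation.
    def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256, num_classes=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)    # shared encoder
        self.classifier = nn.Linear(hid_dim, num_classes)            # task 1: category
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)    # task 2: generation
        self.generator = nn.Linear(hid_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        _, enc_h = self.encoder(self.embed(src_ids))
        class_logits = self.classifier(enc_h[-1])                    # classify from final encoder state
        dec_out, _ = self.decoder(self.embed(tgt_ids), enc_h)        # decode conditioned on the encoder
        gen_logits = self.generator(dec_out)
        return class_logits, gen_logits

Training would typically sum a cross-entropy loss per task; the paper's point is that sharing the encoder (and, as the citing papers below note, adding a consistency term over attention) keeps the generated text and the predicted category from contradicting each other.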

Cited by 11 publications (16 citation statements, all classified as mentioning); references 25 publications. Citing publications span 2020 to 2024.
“…[62] adds sentence-level sentiment classification and attention-level supervision to assist the primary stance detection task. [85] adds attention-level supervision to improve consistency of the two primary language generation tasks. [16] minimizes an auxiliary cosine softmax loss based on the audio encoder to learn more accurate speech-to-semantic mappings.…”
Section: Vanilla (mentioning)
confidence: 99%
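The last point in the excerpt above, an auxiliary cosine softmax loss on the encoder, has a standard form: class scores are cosine similarities between L2-normalised features and L2-normalised class weight vectors, scaled and passed through cross-entropy. A hedged Python sketch under those assumptions (the scale value and shapes are illustrative, not taken from [16]):

import torch.nn.functional as F

def cosine_softmax_loss(features, class_weights, labels, scale=16.0):
    # features: (batch, dim); class_weights: (num_classes, dim); labels: (batch,)
    f = F.normalize(features, dim=-1)
    w = F.normalize(class_weights, dim=-1)
    logits = scale * f @ w.t()          # temperature-scaled cosine similarities
    return F.cross_entropy(logits, labels)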
“…Learning from multiple tasks makes it possible for learning models to capture generalized and complementary knowledge from the tasks at hand besides task-specific features. Tasks in MTL can be tasks with assumed relatedness [20,23,40,56,121], tasks with different styles of supervision (e.g., supervised and unsupervised tasks [41,64,73]), tasks with different types of goals (e.g., classification and generation [85]), tasks with different levels of features (e.g., token-level and sentence-level features [57,109]), and even tasks in different modalities (e.g., text and image data [66,115]). Alternatively, we can treat the same task in multiple ways; a hierarchical architecture models the hierarchical relationships between tasks.…”
Section: Introduction (mentioning)
confidence: 99%
“…3. To maintain uniformity between the attention weights of different tasks, we utilise consistency loss (Nishino et al., 2019) in addition to the original task-specific losses.…”
Section: Happy (mentioning)
confidence: 99%
“…We use the "consistency loss" (Nishino et al., 2019) to reduce the difference between the attention weights from different tasks. Attention agreement favours emotional words while decoding the responses.…”
Section: Consistency Loss (mentioning)
confidence: 99%
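Both excerpts above apply an attention-consistency penalty: the attention distributions that two task-specific decoders place over the same source tokens are pushed toward agreement, on top of the usual task losses. A minimal sketch, assuming a squared-error form of the penalty (the exact formulation in Nishino et al. (2019) may differ):

import torch.nn.functional as F

def attention_consistency_loss(attn_a, attn_b):
    # attn_a, attn_b: (batch, src_len) attention weights from two tasks
    attn_a = attn_a / attn_a.sum(dim=-1, keepdim=True)   # normalise defensively
    attn_b = attn_b / attn_b.sum(dim=-1, keepdim=True)
    return F.mse_loss(attn_a, attn_b)

The total objective would then be the sum of the task-specific losses plus a weighted attention_consistency_loss term.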
“…Reason being that several relational facts may overlap in a sentence (Zhang et al., 2018). Although a conventional MTL method may learn task-specific features and has been successfully applied in a wide variety of scenarios (Zhang and Wang, 2016; Wu et al., 2016; Goo et al., 2018; Han et al., 2019; Nishino et al., 2019; Hu et al., 2019), its flat structure limits the model's ability to learn the correlations between tasks effectively. For example, in Figure 1(a), the model cannot explicitly learn correlations between the two tasks.…”
Section: Introduction (mentioning)
confidence: 99%