2019
DOI: 10.1609/aaai.v33i01.33016260

Transfer Learning for Sequence Labeling Using Source Model and Target Data

Abstract: In this paper, we propose an approach for transferring the knowledge of a neural model for sequence labeling, learned on a source domain, to a new model trained on a target domain where new label categories appear. Our transfer learning (TL) techniques adapt the source model using the target data and the new categories, without accessing the source data. Our solution consists of adding new neurons in the output layer of the target model and transferring parameters from the source model, which are…
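The core idea in the abstract (expand the output layer with neurons for the new label categories while reusing the source model's parameters) can be sketched as follows. This is a minimal illustration under assumed shapes, not the authors' code; the function name, the linear output head, and the label counts are all hypothetical.

```python
import torch
import torch.nn as nn

def expand_output_layer(src_head: nn.Linear, num_new_labels: int) -> nn.Linear:
    """Build a target output layer covering the source labels plus
    `num_new_labels` target-only labels, reusing the source weights."""
    old_out, hidden = src_head.weight.shape
    new_head = nn.Linear(hidden, old_out + num_new_labels)
    with torch.no_grad():
        # Transfer parameters for the labels shared with the source model.
        new_head.weight[:old_out] = src_head.weight
        new_head.bias[:old_out] = src_head.bias
        # Rows for the new categories keep their fresh initialization
        # and are learned from the target data only (no source data needed).
    return new_head

# Usage: adapt a source tagger head without access to the source data.
src_head = nn.Linear(256, 9)       # hypothetical: 9 source labels, 256-dim encoder
tgt_head = expand_output_layer(src_head, num_new_labels=4)
print(tgt_head.weight.shape)       # torch.Size([13, 256])
```

The rest of the target model can then be initialized from the source encoder and fine-tuned on the target data alone.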

Cited by 34 publications (16 citation statements) | References 5 publications
“…Neural Network-based Models for NER Some researchers design different architectures which vary in the word encoder (Chiu and Nichols 2016; Ma and Hovy 2016), sentence encoder (Huang, Xu, and Yu 2015; Ma and Hovy 2016; Chiu and Nichols 2016) and decoder (CRF) (Huang, Xu, and Yu 2015). Some works explore how to transfer learned parameters from the source domain to a new domain (Chen and Moschitti 2019; Lin and Lu 2018; Cao et al. 2018). Recently, Yang, Liang, and Zhang (2018) and Reimers and Gurevych (2017) systematically analyzed neural NER models to provide useful guidelines for NLP practitioners.…”
Section: Related Work
confidence: 99%
“…Reference [10] focuses on transfer learning from English to Japanese, proposing a method called romanization to help dissimilar languages share a common character embedding space. Other approaches include [22], which encodes slot descriptions to vectors and employs an attention layer to obtain slot-aware representations of user input, and [23], which uses features derived from the source model.…”
Section: B. Transfer Learning
confidence: 99%
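The slot-description attention mentioned for [22] amounts to attending from each user token to an encoded slot description. The sketch below is a rough reconstruction with hypothetical names and shapes, not the cited paper's API.

```python
import torch
import torch.nn.functional as F

def slot_aware_representation(user_tokens, slot_desc):
    """Attend from each user token to the encoded slot description,
    producing slot-aware token representations.

    user_tokens: (seq_len, dim)  encoded user input
    slot_desc:   (desc_len, dim) encoded slot description
    """
    scores = user_tokens @ slot_desc.T       # (seq_len, desc_len)
    weights = F.softmax(scores, dim=-1)      # attention over description tokens
    context = weights @ slot_desc            # (seq_len, dim) slot context
    return torch.cat([user_tokens, context], dim=-1)

rep = slot_aware_representation(torch.randn(7, 64), torch.randn(4, 64))
print(rep.shape)  # torch.Size([7, 128])
```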
“…Then, a softmax function is used to yield the conditional probability, where Ỹ denotes one of all possible label sequences (paths). ψ(X, Y) is defined as the sum of emission scores (or state scores) and transition scores over all time steps (Morris and Fosler-Lussier, 2006; Chen and Moschitti, 2019):…”
Section: Model
confidence: 99%
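Written out, the conditional probability this statement describes is the standard linear-chain CRF objective. The notation below is a reconstruction from the prose, not the citing paper's exact symbols: E denotes emission (state) scores, A the label-transition matrix, and 𝒴(X) the set of all possible label sequences for X.

```latex
P(Y \mid X) = \frac{\exp\bigl(\psi(X, Y)\bigr)}
                   {\sum_{\tilde{Y} \in \mathcal{Y}(X)} \exp\bigl(\psi(X, \tilde{Y})\bigr)},
\qquad
\psi(X, Y) = \sum_{t=1}^{T} \bigl( E_{t, y_t} + A_{y_{t-1}, y_t} \bigr)
```

That is, ψ accumulates one emission score and one transition score per time step, and the softmax normalizes over all candidate paths Ỹ.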