Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1367

Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition

Abstract: In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. In particular, we improve differentiable architecture search by removing the softmax-local constraint. Also, we apply differentiable NAS to named entity recognition (NER). It is the first time that differentiable NAS methods are adopted in NLP tasks other than language modeling. On both the PTB language modeling and CoNLL-2003 English NER data, our method outperforms strong baselines. It achieves a …
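The abstract refers to removing the "softmax-local" constraint of differentiable architecture search (DARTS). As a rough sketch only, and not the authors' implementation: in standard DARTS, each edge of the searched cell mixes its candidate operations with weights obtained from a per-edge softmax over architecture parameters, and one reading of "removing the softmax-local constraint" is to drop that per-edge normalization. The class MixedOp, the flag use_local_softmax, and the unnormalized branch below are illustrative assumptions (PyTorch).

import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One edge of a DARTS-style cell: a weighted mixture of candidate ops."""
    def __init__(self, ops, use_local_softmax=True):
        super().__init__()
        self.ops = nn.ModuleList(ops)                      # candidate operations on this edge
        self.alpha = nn.Parameter(torch.zeros(len(ops)))   # architecture parameters
        self.use_local_softmax = use_local_softmax

    def forward(self, x):
        if self.use_local_softmax:
            # Standard DARTS: weights normalized per edge (the "softmax-local" constraint).
            w = F.softmax(self.alpha, dim=-1)
        else:
            # Hypothetical relaxation: use the raw weights, so they are no longer
            # constrained to sum to one on this edge.
            w = self.alpha
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Toy usage: mix three elementwise candidate ops on a batch of hidden states.
mixed = MixedOp([nn.Identity(), nn.Tanh(), nn.ReLU()], use_local_softmax=True)
out = mixed(torch.randn(4, 16))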

Cited by 79 publications (39 citation statements)
References 12 publications
“…The recent work in (Liu et al, 2019b) employs a collaborative memory network to further model the semantic correlations among words, slots and intents jointly. For NER, recent works use explicit architecture to incorporate information such as global context (Liu et al, 2019a) or conduct optimal architecture searches (Jiang et al, 2019). The best performing models have been using pre-training models on large corpus (Baevski et al, 2019) or incorporating fine-tuning on existing pre-trained models (Liu et al, 2019a) such as BERT (Devlin et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…NAS methods have shown strong performance on many NLP and CV tasks, such as language modeling and image classification (Zoph and Le, 2017; Pham et al, 2018; Luo et al, 2018; Liu et al, 2019). Applications in NLP, such as NER (Jiang et al, 2019; Li et al, 2020), translation (So et al, 2019), text classification (Wang et al, 2020), and natural language inference (NLI) (Pasunuru and Bansal, 2019; Wang et al, 2020) have also been explored.…”
Section: Related Work (mentioning)
confidence: 99%
“…Current SOTA approaches focus on learning new cell architectures as replacements for LSTM or convolutional cells (Zoph and Le, 2017; Pham et al, 2018; Liu et al, 2019; Jiang et al, 2019; Li et al, 2020) or entire model architectures to replace hand-designed models such as the transformer or DenseNet (So et al, 2019; Pham et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…In the general domain, NER was first defined to identify personal names, organizations, and locations (Chinchor and Robinson, 1997), to then be extended to a variety of entities depending on the particular application. Nowadays, the best results for the original 2003 NER task (Sang and De Meulder, 2003) are self-attention networks (Baevski et al, 2019), differentiable neural architecture search methods (Jiang et al, 2019), and LSTM-CRF enriched with ELMo, BERT, and Flair contextual embeddings (Straková et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%