Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1367

Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition

Abstract: In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. In particular, we improve differentiable architecture search by removing the softmax-local constraint. Also, we apply differentiable NAS to named entity recognition (NER). It is the first time that differentiable NAS methods are adopted in NLP tasks other than language modeling. On both the PTB language modeling and CoNLL-2003 English NER data, our method outperforms strong baselines. It achieves a …
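The abstract refers to removing the "softmax-local" constraint of differentiable architecture search (DARTS). As a rough sketch only, and not the authors' implementation: in standard DARTS, each edge of the searched cell mixes its candidate operations with weights obtained from a per-edge softmax over architecture parameters, and one reading of "removing the softmax-local constraint" is to drop that per-edge normalization. The class MixedOp, the flag use_local_softmax, and the unnormalized branch below are illustrative assumptions (PyTorch).

import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One edge of a DARTS-style cell: a weighted mixture of candidate ops."""
    def __init__(self, ops, use_local_softmax=True):
        super().__init__()
        self.ops = nn.ModuleList(ops)                      # candidate operations on this edge
        self.alpha = nn.Parameter(torch.zeros(len(ops)))   # architecture parameters
        self.use_local_softmax = use_local_softmax

    def forward(self, x):
        if self.use_local_softmax:
            # Standard DARTS: weights normalized per edge (the "softmax-local" constraint).
            w = F.softmax(self.alpha, dim=-1)
        else:
            # Hypothetical relaxation: use the raw weights, so they are no longer
            # constrained to sum to one on this edge.
            w = self.alpha
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# Toy usage: mix three elementwise candidate ops on a batch of hidden states.
mixed = MixedOp([nn.Identity(), nn.Tanh(), nn.ReLU()], use_local_softmax=True)
out = mixed(torch.randn(4, 16))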

Cited by 79 publications (39 citation statements)
References 12 publications
“…The recent work in (Liu et al, 2019b) employs a collaborative memory network to further model the semantic correlations among words, slots and intents jointly. For NER, recent works use explicit architecture to incorporate information such as global context (Liu et al, 2019a) or conduct optimal architecture searches (Jiang et al, 2019). The best performing models have been using pre-training models on large corpus (Baevski et al, 2019) or incorporating fine-tuning on existing pre-trained models (Liu et al, 2019a) such as BERT (Devlin et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…NAS methods have shown strong performance on many NLP and CV tasks, such as language modeling and image classification (Zoph and Le, 2017; Pham et al, 2018; Luo et al, 2018; Liu et al, 2019). Applications in NLP, such as NER (Jiang et al, 2019; Li et al, 2020), translation (So et al, 2019), text classification (Wang et al, 2020), and natural language inference (NLI) (Pasunuru and Bansal, 2019; Wang et al, 2020) have also been explored.…”
Section: Related Work (mentioning)
confidence: 99%
“…Current SOTA approaches focus on learning new cell architectures as replacements for LSTM or convolutional cells (Zoph and Le, 2017; Pham et al, 2018; Liu et al, 2019; Jiang et al, 2019; Li et al, 2020) or entire model architectures to replace hand-designed models such as the transformer or DenseNet (So et al, 2019; Pham et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…In the general domain, NER was first defined to identify personal names, organizations, and locations (Chinchor and Robinson, 1997), to then be extended to a variety of entities depending on the particular application. Nowadays, the best results for the original 2003 NER task (Sang and De Meulder, 2003) are self-attention networks (Baevski et al, 2019), differentiable neural architecture search methods (Jiang et al, 2019), and LSTM-CRF enriched with ELMo, BERT, and Flair contextual embeddings (Straková et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%