A Gated Self-attention Memory Network for Answer Selection

Lai, Tuan Manh; Tran, Quan Hung; Bui, Trung; Kihara, Daisuke

doi:10.18653/v1/d19-1610

Cited by 30 publications

(34 citation statements)

References 19 publications

“…The third block presents the results of models that use both pre-trained language models and transfer learning. In particular, Yoon et al (2018) use ELMo and transfer learning on the QNLI dataset; Lai et al (2019) use BERT and perform transfer learning on the QNLI dataset, and Garg et al (2019) use RoBERTa large and perform transfer learning from the Natural Question dataset. We note that the MAP of efficient models ranges between 71% to 75%, while the MAP of expensive models ranges between 83% to 92%.…”

Section: State-of-the-art Results (mentioning)

confidence: 99%

“…Despite obtaining better results than previous approaches, the computational cost of performing word-level attention and the aggregation steps to leverage the information extracted by the attention mechanism increases the computational cost of previous methods. More recent models, e.g., (Lai et al, 2019;Garg et al, 2019;Yoon et al, 2018), leverage contextualized word representation, e.g., pre-trained using BERT, ELMo, RoBERTa, etc. These approaches achieve state-of-the-art results for AS2, but they require significant computational power for both pre-training, fine-tuning, and testing on the final task.…”

Section: Related Work (mentioning)

confidence: 99%

“…Therefore, we used the official development set as our test set and a portion of the training set for validation.

(Sha et al, 2018): 74.62, 75.76
(Bian et al, 2017): 75.40, 76.40
Related Work with pre-training
(Yoon et al, 2018): 83.40, 84.80
(Lai et al, 2019): 85.70, 87.20
(Garg et al, 2019): 92.00, 93.30…”

Section: Datasets (mentioning)

confidence: 99%

A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection

Bonadiman; Moschitti (2020)

Proceedings of the 28th International Conference on Computational Linguistics

An essential task of most Question Answering (QA) systems is to re-rank the set of answer candidates, i.e., Answer Sentence Selection (AS2). These candidates are typically sentences either extracted from one or more documents preserving their natural order or retrieved by a search engine. Most state-of-the-art approaches to the task use huge neural models, such as BERT, or complex attentive architectures. In this paper, we argue that by exploiting the intrinsic structure of the original rank together with an effective word-relatedness encoder, we achieve the highest accuracy among the cost-efficient models, with two orders of magnitude fewer parameters than the current state of the art. Our model takes 9.5 seconds to train on the WikiQA dataset, i.e., very fast in comparison with the ∼18 minutes required by a standard BERT-base fine-tuning.

“…It is an active research problem with applications in many areas (Tay et al, 2018a;Tayyar Madabushi et al, 2018;Rao et al, 2019;Lai et al, 2020). Similar to most recent papers on this topic (Tay et al, 2018b;Lai et al, 2019;Garg et al, 2020), we cast the question answering problem as a binary classification problem by concatenating the question with each of the candidate answers and assigning positive label to the concatenation containing the correct answer.…”

Section: Proposed Framework (mentioning)

confidence: 99%

“…To update the memory, however, we first need an indexing mechanism for writing. Instead of using the original indexing of the NTM, we adopt the simpler indexing procedure from the memory network, which has been proven to be useful in this task (Lai et al, 2019). At time step $t$, for each incoming data point $x_t$, we compute the attention weight $w_{e_i^t}$ for the support vector $e_i^t$:…”

Section: Proposed Framework (mentioning)

confidence: 99%

Explain by Evidence: An Explainable Memory-based Neural Network for Question Answering

Tran; Dam; Lai; et al. (2020)

Proceedings of the 28th International Conference on Computational Linguistics

Self Cite

Interpretability and explainability of deep neural networks are challenging due to their scale, complexity, and the agreeable notions on which the explaining process rests. Previous work, in particular, has focused on representing internal components of neural networks through human-friendly visuals and concepts. On the other hand, in real life, when making a decision, humans tend to rely on similar situations and/or associations in the past. Hence arguably, a promising approach to make the model transparent is to design it in a way such that the model explicitly connects the current sample with the seen ones, and bases its decision on these samples. Grounded on that principle, we propose in this paper an explainable, evidence-based memory network architecture, which learns to summarize the dataset and extract supporting evidence to make its decision. Our model achieves state-of-the-art performance on two popular question answering datasets (i.e., TrecQA and WikiQA). Via further analysis, we show that this model can reliably trace the errors it has made in the validation step to the training instances that might have caused these errors. We believe that this error-tracing capability provides significant benefit in improving dataset quality in many applications.

Text‐based question answering from information retrieval and deep neural network perspectives: A survey

Abbasiantaeb; Momtazi (2021)

WIREs Data Min & Knowl

Text‐based question answering (QA) is a challenging task which aims at finding short concrete answers for users' questions. This line of research has been widely studied with information retrieval (IR) techniques and has received increasing attention in recent years by considering deep neural network approaches. Deep learning (DL) approaches, which are the main focus of this paper, provide a powerful technique to learn multiple layers of representations and interaction between the questions and the answer sentences. In this paper, we provide a comprehensive overview of different models proposed for the QA task, including both a traditional IR perspective and a more recent deep neural network environment. We also introduce well‐known datasets for the task and present available results from the literature to have a comparison between different techniques. This article is categorized under: Algorithmic Development > Text Mining Technologies > Machine Learning
