Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/579
Bilateral Multi-Perspective Matching for Natural Language Sentences

Abstract: Natural language sentence matching is a fundamental technology for a variety of tasks. Previous approaches either match sentences from a single direction or only apply single-granular (word-by-word or sentence-by-sentence) matching. In this work, we propose a bilateral multi-perspective matching (BiMPM) model. Given two sentences P and Q, our model first encodes them with a BiLSTM encoder. Next, we match the two encoded sentences in two directions, P against Q and Q against P. In each matching direction, each …
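The abstract's encode-then-match pipeline can be made concrete with a short sketch. Below is a minimal, hypothetical PyTorch illustration (module and variable names are ours, not from the released BiMPM code) of encoding P and Q with a shared BiLSTM and matching in both directions, with plain cosine similarity standing in for the paper's multi-perspective matching operation:

```python
# Minimal sketch, assuming PyTorch; names are illustrative, not from the
# official BiMPM release. Encode P and Q with a shared BiLSTM, then match
# in both directions, P against Q and Q against P.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BilateralMatcher(nn.Module):
    def __init__(self, emb_dim=100, hidden=100):
        super().__init__()
        # One BiLSTM encoder shared by both sentences.
        self.encoder = nn.LSTM(emb_dim, hidden, bidirectional=True,
                               batch_first=True)

    def forward(self, p_emb, q_emb):
        # p_emb: (batch, len_p, emb_dim), q_emb: (batch, len_q, emb_dim)
        p_ctx, _ = self.encoder(p_emb)   # (batch, len_p, 2*hidden)
        q_ctx, _ = self.encoder(q_emb)   # (batch, len_q, 2*hidden)
        # Cosine similarity of every P time step against every Q time step;
        # the full model replaces this with four multi-perspective strategies.
        sim = F.cosine_similarity(p_ctx.unsqueeze(2), q_ctx.unsqueeze(1),
                                  dim=-1)             # (batch, len_p, len_q)
        match_p_to_q = sim.max(dim=2).values  # P matched against Q
        match_q_to_p = sim.max(dim=1).values  # Q matched against P
        return match_p_to_q, match_q_to_p
```

The full BiMPM model aggregates the two matching directions with another BiLSTM before prediction; this sketch stops at the matching step.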

Cited by 581 publications (498 citation statements). References 0 publications.
“…Lei et al (2016) consider a related task leveraging the AskUbuntu corpus (dos Santos et al, 2015), but it contains two orders of magnitude fewer annotations, thus limiting the quality of any model. Most relevant to this work is that of Wang et al (2017), who present the best results on the Quora dataset prior to this work. The bilateral multi-perspective matching model (BIMPM) of Wang et al. uses a character-based LSTM (Hochreiter and Schmidhuber, 1997) at its input representation layer, a layer of bi-LSTMs for computing context information, four different types of multi-perspective matching layers, an additional bi-LSTM aggregation layer, followed by a two-layer feedforward network for prediction.…”
Section: Related Work
confidence: 99%
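The four matching layers mentioned in the quote share one primitive, the multi-perspective cosine matching function defined in Wang et al. (2017): each of l trainable perspectives reweights the two vectors elementwise before a cosine similarity is taken. A minimal sketch (function and parameter names are ours):

```python
import torch
import torch.nn.functional as F

def multi_perspective_match(v1, v2, W):
    """m_k = cosine(W_k * v1, W_k * v2), one similarity per perspective.

    v1, v2: (d,) context vectors; W: (l, d) trainable perspective weights.
    Returns an (l,)-dim matching vector. The full model applies this
    time-step-wise inside four different matching strategies.
    """
    # Broadcasting (l, d) * (d,) reweights each vector once per perspective.
    return F.cosine_similarity(W * v1, W * v2, dim=1)

# Usage: l = 20 perspectives over d = 200-dim BiLSTM states (sizes assumed).
W = torch.randn(20, 200, requires_grad=True)
m = multi_perspective_match(torch.randn(200), torch.randn(200), W)  # (20,)
```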
“…[Table 1 fragment: (Wang et al, 2016a) 0.734 0.742; BiMPM (Wang et al, 2017) 0.…] The orthogonal decomposition (OD) strategy has superior performance to the direct (DI) strategy on all datasets. The comparison results are posted in Table 1.…”
Section: Results
confidence: 99%
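The quoted fragment contrasts an orthogonal decomposition (OD) matching strategy with a direct (DI) one but does not give its equations. One common formulation decomposes a contextual vector into components parallel and orthogonal to its aligned counterpart; a generic sketch under that assumption (names are ours):

```python
import torch

def orthogonal_decompose(v, a, eps=1e-8):
    """Split v into components parallel and orthogonal to an aligned vector a.

    One common orthogonal-decomposition (OD) formulation, assumed here since
    the quoted fragment omits the equations: the parallel part captures what
    v shares with a, the orthogonal part what it does not. A direct (DI)
    strategy would instead compare v and a as-is.
    """
    parallel = (torch.dot(v, a) / (torch.dot(a, a) + eps)) * a
    orthogonal = v - parallel
    return parallel, orthogonal
```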
“…
Model                                  r       ρ       MSE
Meaning Factory (Jiménez et al, 2014)  0.8268  0.7721  0.3224
ECNU (Zhao et al, 2014)                0.8414  -       -
BiLSTM (Tai et al, 2015)               0.8567  0.7966  0.2736
Tree-LSTM (Tai et al, 2015)            0.8676  0.8083  0.2532
MPCNN (He et al, 2015)                 0…

Model                                  MAP     MRR
Wang and Ittycheriah (2015)            0.746   0.820
QA-LSTM (Tan et al, 2015)              0.728   0.832
Att-pooling (dos Santos et al, 2016)   0.753   0.851
LDC (Wang et al, 2016b)                0.771   0.845
MPCNN (He et al, 2015)                 0.777   0.836
PWIM                                   0.738   0.827
NCE-CNN (Rao et al, 2016)              0.801   0.877
BiMPM (Wang et al, 2017)               0.802   0.875
IWAN-att (Proposed)                    0.822   0.889
IWAN-skip (Proposed)                   0.801   0.861
Table 3: Test results on Clean version TrecQA.…”
Section: Training Details
confidence: 99%
“…Baselines. We compare the performance of our models with that of the state-of-the-art models on the clean version of the TREC-QA dataset (Shen et al, 2017; Bian et al, 2017; Wang et al, 2017; Rao et al, 2016; Tay et al, 2017). We do not have access to the original implementation of IWAN; hence, we use our implementation of the IWAN model as the basis for our models.…”
Section: Contextual Language Model
confidence: 99%