FAQ Retrieval using Query-Question Similarity and BERT-Based Query-Answer Relevance

Sakata, Wataru; Shibata, Tomohide; Tanaka, Ribeka; Kurohashi, Sadao

doi:10.48550/arxiv.1905.02851

Cited by 2 publications

(1 citation statement)

References 8 publications

“…Transfer learning via large pre-trained Transformers [46]-the prominent case being BERT [7]-has lead to remarkable empirical successes on a range of NLP problems. The BERT approach to learn textual representations has also significantly improved the performance of neural models for several IR tasks [55,54,37,33,57], that for a long time struggled to outperform classic IR models [53]. In this work we use the no-CL BERT as a strong baseline for the conversation response ranking task.…”

Section: Related Work (mentioning)

confidence: 99%

Curriculum Learning Strategies for IR: An Empirical Study on Conversation Response Ranking

Penha, Hauff

2019

Preprint

Neural ranking models are traditionally trained on a series of random batches, sampled uniformly from the entire training set. Curriculum learning has recently been shown to improve neural models' effectiveness by sampling batches non-uniformly, going from easy to difficult instances during training. In the context of neural Information Retrieval (IR) curriculum learning has not been explored yet, and so it remains unclear (1) how to measure the difficulty of training instances and (2) how to transition from easy to difficult instances during training. To address both challenges and determine whether curriculum learning is beneficial for neural ranking models, we need large-scale datasets and a retrieval task that allows us to conduct a wide range of experiments. For this purpose, we resort to the task of conversation response ranking: ranking responses given the conversation history. In order to deal with challenge (1), we explore scoring functions to measure the difficulty of conversations based on different input spaces. To address challenge (2) we evaluate different pacing functions, which determine the velocity in which we go from easy to difficult instances. We find that, overall, by just intelligently sorting the training data (i.e., by performing curriculum learning) we can improve the retrieval effectiveness by up to 2%.
