The World Wide Web Conference 2019
DOI: 10.1145/3308558.3313502

Improved Cross-Lingual Question Retrieval for Community Question Answering

Cited by 14 publications (9 citation statements)
References 33 publications

“…They used those embeddings to translate query terms word by word into the document language. Rücklé et al (2019) trained NMT model for CLIR using out-domain data and synthetic data (created by translating in-domain monolingual English into German) to retrieve answers to German questions from English collection in the technical domain (AskUbuntu and StackOverflow).…”
Section: Related Work
Type: mentioning
confidence: 99%
“…This shows that our work can potentially benefit a much wider range of related tasks beyond duplicate question detection. For instance, future work could extend upon this by using our methods to obtain more training data in cross-lingual cQA setups (Joty et al, 2017; Rücklé et al, 2019b), or by combining them with other training strategies, e.g., using our methods for pre-training.…”
Section: Discussion
Type: mentioning
confidence: 99%
“…Most of them are collected from open-domain and comprise of comparable/parallel document pairs, e.g., WaCky translation dataset (Baroni et al 2009; Joulin et al 2018), benchmark CLEF corpora (Vulić and Moens 2015), the Askubuntu benchmark corpus in QA task (Dos Santos et al 2015; Barzilay et al 2016), Arabic-English language pairs (Da San Martino et al 2017), text stream alignment 2018) follow previous dataset construction method (Schamoni et al 2014) and collect 25 cross-lingual datasets with large scales based on Wikipedia. Rücklé et al (2019) extend open-domain cross-lingual question retrieval to the task-oriented domain, i.e., constructing a dataset upon StackOverflow. To facilitate the development of cross-lingual information retrieval in cross-border e-Commerce, we, for the first time, create a high-quality heuristic dataset from real commercial applications.…”
Section: Existing Datasets
Type: mentioning
confidence: 99%
“…According to the translation direction, these systems are further categorized into translating the local language to the foreign one and translating the foreign language to the local one. By doing so, the CLIR task is converted to a monolingual setting through the machine translation system (Nie 2010;Rücklé, Swarnkar, and Gurevych 2019). Although these systems are conceptual simple and natural, they are restricted by the performance of machine translation system.…”
Section: Introduction
Type: mentioning
confidence: 99%