2016
DOI: 10.1007/978-3-319-41718-9_8

Construction of a Russian Paraphrase Corpus: Unsupervised Paraphrase Extraction

Cited by 22 publications (8 citation statements)
References 14 publications
“…For the above mentioned tasks we are going to use the following corpora: Paraphraser.ru (Pronoza et al, 2016) for the Russian language paraphrase identification task, Microsoft Research Paraphrase Corpus (Dolan et al, 2004) for the English language paraphrase identification task, Turkish Paraphrase Corpus (Demir et al, 2012) for the Turkish language paraphrase identification task; Russian Twitter Sentiment Corpus (Rubtsova, 2014) for the Russian language sentiment analysis task, Stanford Sentiment Treebank (Socher et al, 2013) for the English language sentiment analysis task; and Stanford Natural Language Inference (Bowman et al, 2015) for the English language natural language inference task.…”
Section: Methods (mentioning)
confidence: 99%
“…Paraphraser [17] is a dataset for the paraphrasing task: it consists of sentence pairs, each of which is labeled as paraphrase, not paraphrase or maybe paraphrase. This task is close to DaNetQA as the model is required to detect linkage between sentences.…”
Section: Task Transferring (mentioning)
confidence: 99%
“…• English - Microsoft Research Paraphrase Corpus (Dolan et al, 2004) consists of 5,800 sentence pairs extracted from news sources on the web and manually labelled for presence/absence of semantic equivalence. • Russian - Russian Paraphrase Corpus (Pronoza et al, 2016) consists of news headings from different news agencies. It contains around 6,000 pairs of phrases labelled in terms of ternary scale: "-1" - not paraphrase, "0" - weak paraphrase, and "1" - strong paraphrase.…”
Section: Paraphrase Detection (mentioning)
confidence: 99%