Proceedings - Natural Language Processing in a Deep Learning World 2019
DOI: 10.26615/978-954-452-056-4_067

A Qualitative Evaluation Framework for Paraphrase Identification

Abstract: In this paper, we present a new approach for the evaluation, error analysis, and interpretation of supervised and unsupervised Paraphrase Identification (PI) systems. Our evaluation framework makes use of a PI corpus annotated with linguistic phenomena to provide a better understanding and interpretation of the performance of various PI systems. Our approach allows for a qualitative evaluation and comparison of the PI models using human interpretable categories. It does not require modification of the training…
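The core idea of the abstract, scoring a PI system separately on each annotated linguistic phenomenon rather than reporting a single overall accuracy, can be illustrated with a minimal sketch. The data layout, phenomenon labels, and the word-overlap stand-in predictor below are illustrative assumptions only, not the corpus format or any system evaluated in the paper.

```python
from collections import defaultdict

# Illustrative sketch only: field names and phenomenon labels are assumptions,
# not the annotation scheme of the paper's corpus.
annotated_pairs = [
    {"s1": "The firm bought its rival.", "s2": "The company acquired its competitor.",
     "gold": 1, "phenomena": ["synonymy"]},
    {"s1": "She did not sign the deal.", "s2": "She signed the deal.",
     "gold": 0, "phenomena": ["negation"]},
    {"s1": "Police arrested the suspect.", "s2": "The suspect was arrested by police.",
     "gold": 1, "phenomena": ["diathesis alternation"]},
]

def per_phenomenon_accuracy(pairs, predict):
    """Break accuracy down by annotated phenomenon so errors become interpretable."""
    correct, total = defaultdict(int), defaultdict(int)
    for pair in pairs:
        pred = predict(pair["s1"], pair["s2"])
        for phenomenon in pair["phenomena"]:
            total[phenomenon] += 1
            correct[phenomenon] += int(pred == pair["gold"])
    return {p: correct[p] / total[p] for p in total}

# Stand-in predictor (word-overlap heuristic) just to make the sketch runnable.
def overlap_predictor(s1, s2, threshold=0.5):
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return int(len(w1 & w2) / max(len(w1 | w2), 1) >= threshold)

print(per_phenomenon_accuracy(annotated_pairs, overlap_predictor))
```

Reporting one accuracy per category in this way is what lets a comparison say, for example, that a model handles synonymy well but fails systematically on negation.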

Cited by 8 publications (6 citation statements) | References: 16 publications
“…The task of identifying paraphrases consists in determining whether two sentences have the same meaning, and can be cast, at least from a definitional perspective, as recognizing bidirectional entailments. Pruthi et al. (2019) show that computational models underperform on MRPC (Dolan et al., 2004) with adversarial misspellings, and Kovatchev et al. (2019) present a qualitative analysis of 11 state-of-the-art models (overall accuracies: 68-84%). When negation is present, however, accuracies drop to 33% (6 models), 67% (4 models), and 1% (1 model).…”
Section: Previous Work (mentioning)
Confidence: 96%
“…In contrast, deep learning neural networks, such as bidirectional long short-term memory (BiLSTM; Graves & Schmidhuber, 2005) and Transformer networks such as Bidirectional Encoder Representations from Transformers (“BERT”) and its distilled variant “DistilBERT” (Sanh et al., 2020), do not require feature engineering. Deep learning neural networks exhibit better performance than traditional machine learning algorithms in a variety of language processing tasks such as paraphrase identification (Kovatchev et al., 2019). We tested whether machine learning and deep learning automated scoring systems can provide an end-to-end solution for scoring ToM whereby the system takes children’s open-ended responses to the Silent Film and Strange Stories tasks and returns assigned scores for each item, replacing the need for manual scoring.…”
Section: Measuring “Advanced” ToM (mentioning)
Confidence: 99%
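As a concrete illustration of such an end-to-end, feature-free setup, the sketch below runs sentence-pair classification with a pretrained Transformer through the Hugging Face transformers library. The checkpoint name and the assumption that label index 1 means “paraphrase” are illustrative; this is not the scoring pipeline used in the cited study or in the paper above.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed checkpoint: any sequence-pair model fine-tuned for paraphrase
# identification (e.g. on MRPC) would do; the name and label mapping below
# are assumptions for illustration.
CHECKPOINT = "textattack/bert-base-uncased-MRPC"

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT)
model.eval()

def paraphrase_probability(sentence_a: str, sentence_b: str) -> float:
    """Encode the raw sentence pair and return P(paraphrase); no hand-built features."""
    inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumes label index 1 corresponds to "paraphrase"; depends on the checkpoint.
    return torch.softmax(logits, dim=-1)[0, 1].item()

print(paraphrase_probability("The firm bought its rival.",
                             "The company acquired its competitor."))
```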
“…e.g., Androutsopoulos & Malakasiotis, 2010; Klemen & Robnik-Šikonja, 2021; Kovatchev et al., 2020; Marsi et al., 2007; Martin, 1976; Mel'čuk, 1988, 1992; Mel'čuk & Milićević, 2014; Milićević, 2007; Zhou et al., 2022), cf. also work on semantic textual similarity, of which paraphrase is a special case (e.g.…”
Section: Inferences and Paraphrases (unclassified)