2014
DOI: 10.1162/tacl_a_00178

Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence

Abstract: We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources. Based on the hypothesis that words with similar meanings represent potential pairs for alignment if located in similar contexts, we propose a system that operates by finding such pairs. In two intrinsic evaluations on alignment test data, our system achieves F1 scores of 88–92%, demonstrating 1–3% absolute improvement over the previous best system. […]
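
The hypothesis in the abstract reduces to a compact procedure: treat a word pair as an alignment candidate when the words themselves are similar and their surrounding contexts are similar. The sketch below is a minimal, hypothetical rendition of that idea, not the authors' pipeline; the weights, window size, threshold, and the pluggable word_sim function are illustrative assumptions, and the published aligner additionally uses exact and lemma matches, PPDB paraphrases, named entities, and dependency-based context.

from typing import Callable, List, Tuple


def align(
    src: List[str],
    tgt: List[str],
    word_sim: Callable[[str, str], float],  # lexical similarity in [0, 1]
    ctx_window: int = 3,
    w_lex: float = 0.9,
    w_ctx: float = 0.1,
    threshold: float = 0.7,
) -> List[Tuple[int, int]]:
    """Greedy one-to-one alignment from word similarity plus contextual evidence."""

    def context(tokens: List[str], i: int) -> List[str]:
        lo, hi = max(0, i - ctx_window), min(len(tokens), i + ctx_window + 1)
        return [t for k, t in enumerate(tokens[lo:hi], start=lo) if k != i]

    def ctx_sim(i: int, j: int) -> float:
        # contextual evidence: strongest lexical match among neighbouring word pairs
        pairs = [(a, b) for a in context(src, i) for b in context(tgt, j)]
        return max((word_sim(a, b) for a, b in pairs), default=0.0)

    scored = [
        (w_lex * word_sim(s, t) + w_ctx * ctx_sim(i, j), i, j)
        for i, s in enumerate(src)
        for j, t in enumerate(tgt)
    ]
    alignment, used_src, used_tgt = [], set(), set()
    for score, i, j in sorted(scored, reverse=True):  # commit best-scoring pairs first
        if score >= threshold and i not in used_src and j not in used_tgt:
            alignment.append((i, j))
            used_src.add(i)
            used_tgt.add(j)
    return alignment

For example, align("the boy ran".split(), "a boy sprinted".split(), lambda a, b: float(a == b)) returns [(1, 1)]: only the exact match "boy"/"boy" clears the threshold, while a richer word_sim (e.g. one backed by a paraphrase resource) would also pair "ran" with "sprinted".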

Cited by 96 publications (31 citation statements); references 11 publications.

“…This is in contrast to SARI, which is designed to evaluate simplifications involving paraphrasing. EASSE re-factors the original SAMSA implementation with some modifications: (1) an internal call to the TUPA parser (Hershcovich et al., 2017), which generates the semantic annotations for each original sentence; (2) a modified version of the monolingual word aligner (Sultan et al., 2014) that is compatible with Python 3 and uses Stanford CoreNLP (Manning et al., 2014) through their official Python interface; and (3) a single function call to get a SAMSA score instead of running a series of scripts.…”
Section: Automatic Corpus-level Metrics
confidence: 99%
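
The excerpt above describes packaging as much as modelling, but the "single function call instead of a series of scripts" design is easy to picture. The following is a purely illustrative sketch and does not reproduce EASSE's actual API: the names samsa_score, parse_ucca, align_words, and score_pair are hypothetical, and the parser, aligner, and scorer are assumed to be injected as callables.

from typing import Callable, List, Sequence, Tuple

Alignment = List[Tuple[int, int]]


def samsa_score(
    orig_sentences: Sequence[str],
    sys_sentences: Sequence[str],
    parse_ucca: Callable[[str], object],           # e.g. a wrapper around the TUPA parser
    align_words: Callable[[str, str], Alignment],  # e.g. the monolingual word aligner
    score_pair: Callable[[object, Alignment, str], float],
) -> float:
    """Corpus-level score: annotate, align, and score each pair, then average."""
    scores = []
    for orig, simplified in zip(orig_sentences, sys_sentences):
        annotation = parse_ucca(orig)              # semantic annotation of the original sentence
        alignment = align_words(orig, simplified)  # original/output word alignment
        scores.append(score_pair(annotation, alignment, simplified))
    return sum(scores) / len(scores) if scores else 0.0
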
“…These systems correlate at about 80% with similarity scores annotated by humans (Cer et al., 2017). Most systems integrate combinations of multiple algorithms that provide partial scores for a number of aspects of the sentences (Sultan et al., 2014). For instance, a typical similarity score between the sentences of a pair can be obtained as follows (Pilehvar and Navigli, 2015; Brychcín and Svoboda, 2016):…”
Section: STS and the Distinction Between STS Systems and Sentence Rep…
confidence: 99%
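
The excerpt cuts off before the formula it announces, so it is left truncated above. For orientation only, one widely used alignment-based score of this kind (the unsupervised measure of Sultan et al., 2014) rates a sentence pair by the proportion of content words that the aligner pairs up; the notation below follows that convention rather than reproducing the cited papers' exact equations.

\[
\operatorname{sim}\bigl(S^{(1)}, S^{(2)}\bigr) = \frac{n_a^{c}\bigl(S^{(1)}\bigr) + n_a^{c}\bigl(S^{(2)}\bigr)}{n^{c}\bigl(S^{(1)}\bigr) + n^{c}\bigl(S^{(2)}\bigr)},
\]

where \(n^{c}(S)\) is the number of content words in sentence \(S\) and \(n_a^{c}(S)\) is the number of those words that receive an alignment in the other sentence.
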
“…There is much less research related to the measurement of similarity between sentences or short text fragments (Islam and Inkpen, 2008). To evaluate the degree of semantic similarity between two English sentences, Sultan et al. used an unsupervised system that relied on word alignment (Sultan et al., 2014) or combined a vector similarity feature with alignment-based similarity (Sultan et al., 2015). Quite a few researchers now apply word-alignment algorithms to compute the semantic similarity between two sentences.…”
Section: Related Work
confidence: 99%
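
As a rough illustration of the combination the last excerpt mentions, the sketch below blends an alignment-based proportion with a sentence-embedding cosine. The equal weighting, the embed callable, and the is_content heuristic are illustrative assumptions, not the published model, and the aligner argument stands in for any monolingual word aligner.

import math
from typing import Callable, List, Sequence, Tuple

Aligner = Callable[[List[str], List[str]], List[Tuple[int, int]]]


def alignment_similarity(
    s1: List[str],
    s2: List[str],
    aligner: Aligner,
    is_content: Callable[[str], bool] = str.isalpha,  # crude content-word test
) -> float:
    """Proportion of content words in both sentences that get aligned."""
    pairs = aligner(s1, s2)
    aligned_1 = {i for i, _ in pairs if is_content(s1[i])}
    aligned_2 = {j for _, j in pairs if is_content(s2[j])}
    content_1 = sum(1 for w in s1 if is_content(w))
    content_2 = sum(1 for w in s2 if is_content(w))
    total = content_1 + content_2
    return (len(aligned_1) + len(aligned_2)) / total if total else 0.0


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(
    s1: List[str],
    s2: List[str],
    aligner: Aligner,
    embed: Callable[[List[str]], Sequence[float]],  # hypothetical sentence-embedding function
    weight: float = 0.5,                            # illustrative, not tuned
) -> float:
    """Blend alignment coverage with an embedding cosine."""
    alignment_part = alignment_similarity(s1, s2, aligner)
    vector_part = cosine(embed(s1), embed(s2))
    return weight * alignment_part + (1 - weight) * vector_part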