Proceedings of the ACL Workshop on Empirical Modeling of Semantic Equivalence and Entailment (EMSEE '05), 2005
DOI: 10.3115/1631862.1631865

Measuring the semantic similarity of texts

Abstract: This paper presents a knowledge-based method for measuring the semantic similarity of texts. While there is a large body of previous work focused on finding the semantic similarity of concepts and words, the application of these word-oriented methods to text similarity has not yet been explored. In this paper, we introduce a method that combines word-to-word similarity metrics into a text-to-text metric, and we show that this method outperforms the traditional text similarity metrics based on lexical matching.
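To make the abstract's idea concrete, here is a minimal sketch of one way word-to-word scores can be pooled into a text-to-text score: each word is matched against its most similar counterpart in the other text, and the best scores are averaged with idf weights in both directions. The `word_sim` function and `idf` table are hypothetical stand-ins, not the paper's exact formulation.

```python
def text_similarity(text1, text2, word_sim, idf):
    """Symmetric text-to-text similarity built from word-to-word scores."""
    tokens1, tokens2 = text1.split(), text2.split()
    if not tokens1 or not tokens2:
        return 0.0

    def directional(source, target):
        # For each word in `source`, keep its best match in `target`,
        # then take an idf-weighted average of those best scores.
        num = sum(max(word_sim(w, t) for t in target) * idf.get(w, 1.0)
                  for w in source)
        den = sum(idf.get(w, 1.0) for w in source)
        return num / den if den else 0.0

    return 0.5 * (directional(tokens1, tokens2) + directional(tokens2, tokens1))
```

Plugging in an exact-match `word_sim` (1.0 for identical words, 0.0 otherwise) recovers a plain lexical-matching baseline, which is the kind of metric the paper reports improvements over.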

Cited by 233 publications (153 citation statements)
References 16 publications
“…In line with many other researches (e.g., [5]), we determine these anchors using different similarity or relatedness models: the exact matching between tokens or lemmas, a similarity between tokens based on their edit distance, the derivationally related form relation and the verb entailment relation in WordNet, and, finally, a WordNet-based similarity [6]. Each of these detectors gives a different weight to the anchor: 0/1 for the first and the similarity value for all the others.…”
Section: Training Examples As Pairs Of Co-indexed Trees (supporting)
confidence: 64%
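The weighting scheme quoted above lends itself to a small illustration. Below is a minimal sketch: exact token or lemma matches contribute a binary 0/1 weight, while the remaining detectors contribute their similarity value. The `edit_sim` and `wordnet_sim` functions are hypothetical stand-ins, and the acceptance threshold is an assumption (the quoted work does not state one).

```python
def anchor_weight(tok1, tok2, lemma1, lemma2, edit_sim, wordnet_sim, threshold=0.7):
    """Weight of a candidate anchor between two tokens, per the cascade above."""
    # Detector 1: exact match on tokens or lemmas -> binary 0/1 weight.
    if tok1 == tok2 or lemma1 == lemma2:
        return 1.0
    # Remaining detectors: the anchor weight is the similarity value itself.
    for detector in (edit_sim, wordnet_sim):
        score = detector(tok1, tok2)
        if score >= threshold:
            return score
    return 0.0
```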
“…We compared three different kernels: (1) the kernel K_l((T, H), (T′, H′)) = sim_l(T, H) × sim_l(T′, H′), which is based on the intra-pair lexical similarity sim_l(T, H) described in [5]. (2) The kernel K_l + K_s that combines our kernel with the lexical-similarity-based kernel.…”
Section: Experimented Kernels (mentioning)
confidence: 99%
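The product kernel in the quotation reduces to a one-liner once a text-to-text similarity sim_l is available; the sketch below assumes such a function (for instance, the one sketched after the abstract) and is only illustrative.

```python
def lexical_pair_kernel(pair_a, pair_b, sim_l):
    """K_l((T, H), (T', H')) = sim_l(T, H) * sim_l(T', H')."""
    (t_a, h_a), (t_b, h_b) = pair_a, pair_b
    return sim_l(t_a, h_a) * sim_l(t_b, h_b)
```

The combined kernel K_l + K_s mentioned in the quote simply adds a second kernel K_s (e.g., one defined over the co-indexed trees named in the earlier citation) to this value.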
“…The open source project WordNet::Similarity 5 implements all of these measures, and was used to compute the similarity scores [20]. As the focus in this study is on the comparison of short segments of text, rather than individual words, the word similarity scores are combined using the technique developed by Corley and Mihalcea [6]. Since the OSM Wiki website holds about 1,900 concept definitions, the complete, symmetric similarity matrix for OSM concepts would contain about 1.8 million rankings.…”
Section: Discussion (mentioning)
confidence: 99%
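The matrix size quoted above follows directly from counting unordered pairs; a minimal check, using the 1,900-concept figure taken from the quote:

```python
# One score per unordered pair of distinct concepts in a symmetric matrix.
n_concepts = 1900
unique_pairs = n_concepts * (n_concepts - 1) // 2
print(unique_pairs)  # 1804050 -> roughly 1.8 million pairwise scores
```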
“…Anti-plagiarism techniques rely on the distributional hypothesis to detect suspiciously close citation patterns and similarities in writing styles across different text documents. More recently, the problem of paraphrase detection has become an active research area (Corley and Mihalcea 2005). For example, the sentence 'The Iraqi Foreign Minister warned of disastrous consequences if Turkey launched an invasion of Iraq' should be classified as a paraphrase of 'Iraq has warned that a Turkish incursion would have disastrous results' (Fernando and Stevenson 2008, p. 2).…”
Section: Text-to-text Semantic Similarity (mentioning)
confidence: 99%