2007
DOI: 10.1109/tpami.2007.1078

A Normalized Levenshtein Distance Metric

Abstract: Although a number of normalized edit distances presented so far may offer good performance in some applications, none of them can be regarded as a genuine metric between strings because they do not satisfy the triangle inequality. Given two strings X and Y over a finite alphabet, this paper defines a new normalized edit distance between X and Y as a simple function of their lengths (|X| and |Y|) and the Generalized Levenshtein Distance (GLD) between them. The new distance can be easily computed through GLD wit…
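
To make the abstract's description concrete, here is a minimal Python sketch of a normalized distance computed from the two string lengths and the edit distance. It assumes unit costs for insertion, deletion, and substitution, and uses the closed form usually cited for this paper under that assumption, 2·GLD / (|X| + |Y| + GLD). The truncated abstract above does not state the formula, so treat it as an assumption here. The function names are hypothetical and not taken from the paper.

```python
def levenshtein(x: str, y: str) -> int:
    """Unit-cost Levenshtein distance using a two-row dynamic program."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]  # deleting the first i characters of x
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1
            curr.append(min(
                prev[j] + 1,         # delete cx
                curr[j - 1] + 1,     # insert cy
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def normalized_levenshtein(x: str, y: str) -> float:
    """Assumed unit-cost form: 2*d / (|x| + |y| + d), with 0 for two empty strings."""
    if not x and not y:
        return 0.0
    d = levenshtein(x, y)
    return 2.0 * d / (len(x) + len(y) + d)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))             # 3
    print(normalized_levenshtein("kitten", "sitting"))  # 0.375
```

Under this form the value lies in [0, 1]: it is 0 for identical strings and 1 when one string is empty and the other is not.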

Cited by 693 publications (328 citation statements)
References 23 publications
“…those that cannot be mapped to vector spaces. An example thereof is a set of text documents that use the edit metric Yujian & Bo (2007) to measure the distance between documents. Hjaltason & Samet (2003).…”
Section: Metric Access Methods
Type: mentioning
confidence: 99%
“…These words were selected to match the following criteria: (1) letters comprising the words must represent the English letters in as uniform distribution as possible, as shown in Fig.1; (2) they should be of different lengths; and (3) the similarity between words measured by Levenshtein distance [10] must be qualitatively variable, as demonstrated qualitatively in Fig.2. This limited vocabulary set can be used later for controlling machines or enabling the performance of various daily activities.…”
Section: A Corpus Design
Type: mentioning
confidence: 99%
“…The most common variations are the ones presented in Table 4. [29]). |x|, |y| are the lengths of the strings x,y, respectively.…”
Section: Text Matching
Type: mentioning
confidence: 99%