IFIP International Federation for Information Processing
DOI: 10.1007/978-0-387-34747-9_31

Comparison of distance measures for historical spelling variants

Abstract: This paper compares selected distance measures with respect to their applicability for supporting retrieval of historical spelling variants (hsv). The interdisciplinary project "Rule-based search in text databases with nonstandard orthography" develops a fuzzy full-text search engine for historical text documents. This engine should provide easier text access for experts as well as interested amateurs. The FlexMetric framework enhances the distance measure algorithm found to be most efficient acco…

Cited by 14 publications (13 citation statements)
References 8 publications (6 reference statements)
“…Studies on HDR have generally focused on the differences between historical and modern languages. OCR errors have been omitted from the experimental settings by using manually created or manually corrected test data (e.g., Braun et al., ; Gotscharek, Reffle, Ringsletter, Schulz, & Neumann, ; Hauser, Heller, Leiss, Schulz, & Wanzeck, ; Kempken et al., , Koolen et al., ; O'Rourke et al., ). An exception is Pilz, Luther, Fuhr, and Ammon (), who created rules for handling OCR errors both manually and automatically based on edit costs between character replacements.…”
Section: Related Research (mentioning)
confidence: 99%
“…Kempken et al. () used an edit distance variant where the edit costs were automatically learned from the German historical document collection. They concluded that algorithms that are adapted to the specific historical phenomena of the collection can reach a better translation recall and precision than standard edit distance and n-grams (Kempken et al., ).…”
Section: Related Research (mentioning)
confidence: 99%
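
The idea of an edit distance with collection-specific costs, as described in the statement above, can be illustrated with a minimal sketch. It is not the method of Kempken et al.: the function name `weighted_edit_distance`, the cost table `HYPOTHETICAL_COSTS`, and its values are assumptions made up for illustration, and the costs here are set by hand rather than learned from a collection.

```python
def weighted_edit_distance(source, target, sub_costs=None, indel_cost=1.0):
    """Levenshtein distance with per-character-pair substitution costs.

    sub_costs maps (source_char, target_char) to a cost; pairs that are
    not listed fall back to the standard cost of 1.0.
    """
    sub_costs = sub_costs or {}
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]

    # Transforming a prefix to or from the empty string uses only
    # deletions or insertions.
    for i in range(1, rows):
        dist[i][0] = i * indel_cost
    for j in range(1, cols):
        dist[0][j] = j * indel_cost

    for i in range(1, rows):
        for j in range(1, cols):
            a, b = source[i - 1], target[j - 1]
            substitution = 0.0 if a == b else sub_costs.get((a, b), 1.0)
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,        # delete a
                dist[i][j - 1] + indel_cost,        # insert b
                dist[i - 1][j - 1] + substitution,  # match or substitute
            )
    return dist[-1][-1]


# Hypothetical, hand-picked costs (not learned from any data): cheap
# replacements for letter pairs that often alternate in older German spelling.
HYPOTHETICAL_COSTS = {
    ("y", "i"): 0.2, ("i", "y"): 0.2,
    ("v", "u"): 0.2, ("u", "v"): 0.2,
}

if __name__ == "__main__":
    print(weighted_edit_distance("seyn", "sein"))                      # standard costs
    print(weighted_edit_distance("seyn", "sein", HYPOTHETICAL_COSTS))  # adapted costs
```

With the adapted table, a pair such as "seyn" and "sein" ends up closer than under uniform costs, which is the effect that collection-specific costs are meant to produce for retrieval.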
“…An alternative is to employ string distance measures, which do not require the VSM. Many of these measures have been defined for different purposes and applications (Cohen et al., 2003; Gravano et al., 2001; Huang and Madey, 2004; Kempken et al., 2006 …”
unclassified