Proceedings of the 5th International Conference on Data Management Technologies and Applications 2016
DOI: 10.5220/0005926300570064

Performance Evaluation of Phonetic Matching Algorithms on English Words and Street Names - Comparison and Correlation

Abstract: Researchers confront major problems while searching for various kinds of data in a large imprecise database, as they are not spelled correctly or in the way they were expected to be spelled. As a result, they cannot find the word they are looking for. Over the years of struggle, relying on pronunciation of words was considered to be one of the practices to solve the problem effectively. The technique used to acquire words based on sounds is known as "Phonetic Matching". Soundex is the first algorithm proposed …

Cited by 10 publications (10 citation statements)
References 5 publications
“…Second, in real ground truth data based on comparing two independent transcriptions of the 1940 census, we find that while automated methods miss many links because of transcription differences, the links they create are almost 100% correct. Third, Bailey et al (2019) continues to report results that mix the Abramitzky, Boustan, and Eriksson approach with an outdated name standardizing algorithm (Soundex) that is not used in contemporary linking papers (see Koneru et al (2016), which suggests that NYSIIS result in fewer false positives than Soundex). Not surprisingly, this method is reported to have the highest false positive rates (43%).…”
Section: Linking Algorithms (mentioning)
confidence: 99%
“…It is possible to solve the specified issue by using phonetic algorithms that compare words such as the modified Metaphone algorithm given in Listing 1. It takes into consideration, first of all, peculiarities in the formation of Ukrainian last names (Table 1) and the titles of medicines [17], and, in contrast to those existing [4,5,11], makes it possible to form phonetic transformations for indexes of highly specialized Ukrainian words/terms.…”
Section: Discussion Of Results Of Studying a Phonetic Algorithm For I… (mentioning)
confidence: 99%
“…An attempt to adapt several algorithms for indexing place names was made in [11]. Due to a small sample of the experimental data and a small divergence between the results from the proposed algorithms, it is difficult to give an objective estimation of the quality of operation of each separate one.…”
Section: Literature Review and Problem Statement (mentioning)
confidence: 99%
“…At last, if no match is found, we look for unique matches within a two-year duration period for people whose first and last name are the same based on the New York State Identification and Intelligence System Phonetic Code. This phonetic code has been shown to result in fewer false positives than alternative algorithms (Koneru et al, 2016). In each of these steps, only unique matches are accepted.…”
Section: B Appendix: Matching Attrition and Selection (mentioning)
confidence: 99%