1980
DOI: 10.1002/asi.4630310106

Automatic detection and correction of spelling errors in a large data base

Abstract: On‐line bibliographic search systems tend to increase the visibility of spelling errors through the use of indexes of unique terms; even low error rates in a data base can result in large numbers of misspelled terms in these indexes. This article describes the techniques used to detect and correct spelling errors in the data base of Chemical Abstracts Service. A computer program for spelling error detection achieves a high level of performance using hashing techniques for dictionary look‐up and compression. He…

Cited by 19 publications (6 citation statements)
References 19 publications
“…This approach uses n-grams to discover words that match and "nearly" match target terms, then add these additional terms to the original query. This approach has wide appeal since it could be largely language independent and could be applied to various concept (word) representations such as phonemes, soundex codes [14,13] or for spelling correction [12], using differing retrieval engines [2], or as a means to summarize the content of a document [3].…”
Section: N-grams Based Query Term Expansion (mentioning)
confidence: 99%
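
The n-gram matching idea in the statement above can be illustrated with a minimal sketch. It assumes character trigrams and a Dice overlap score; the function names, the `#` padding character, and the 0.5 threshold are hypothetical choices for illustration and are not taken from the cited papers.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded at both ends."""
    pad = "#" * (n - 1)  # assumed padding symbol so word edges form n-grams too
    padded = f"{pad}{word.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b):
    """Dice coefficient between two n-gram sets (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def expand_term(term, vocabulary, threshold=0.5, n=3):
    """Return vocabulary words that match or nearly match `term`.

    Words are ranked by descending similarity; ties are broken alphabetically.
    """
    target = char_ngrams(term, n)
    scored = [(dice(target, char_ngrams(w, n)), w) for w in vocabulary]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [w for score, w in ranked if score >= threshold]


if __name__ == "__main__":
    index_terms = ["spelling", "speling", "spell", "chemistry"]
    # The near matches returned here would be added to the original query.
    print(expand_term("spelling", index_terms))
```

Because the comparison works on character sequences rather than a dictionary, the same sketch applies to any term representation that can be written as a string, which is the language-independence the statement points to.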
“…A variety of spelling-related applications have been proposed and examined: n-grams have been used to correct Morse code (McElwain & Evens, 1962); flag possible typing errors (Morris & Cherry, 1975; Zamora, 1980; Zamora et al., 1981); perform general spelling correction (Angell, 1983; Ullmann, 1977); aid in optical character recognition (OCR) classification (Hussain & Donaldson, 1974; Shinghal, Rosenberg, & Toussaint, 1978); correct spelling errors in OCR after classification (Cornew, 1968; Damerau, 1964; Thomas & Kassler, 1967); and to do both detection and correction of OCR spelling errors (Hanson et al., 1976; Hull & Srihari, 1982; Neuhoff, 1975; Riseman & Hanson, 1974; Vossler & Branston, 1964). An overview of spelling-related applications and results may be found in the excellent reviews by Suen (1979) and Kukich (1992).…”
Section: N-grams (mentioning)
confidence: 99%
“…The analysis focused on detecting empty fields, misplaced data, and duplicate references, on normalizing the writing rules (periods, abbreviations, etc.), and on detecting spelling and typographical errors (13)(14)(15). No distinction was made between these last two types of errors (16).…”
Section: Methodology (unclassified)