Proceedings of the 19th International Conference on Computational Linguistics - 2002
DOI: 10.3115/1071884.1071898

Recognition assistance treating errors in texts acquired from various recognition processes

Abstract: Texts acquired from recognition sources (continuous speech/handwriting recognition and OCR) generally have three types of errors, regardless of the characteristics of the particular source. The output of the recognition process may (1) be poorly segmented or not segmented at all; (2) contain underspecified symbols, where the recognition process can only indicate that the symbol belongs to a specific group, e.g. shape codes; (3) contain incorrectly identified symbols. The project presented in this paper …

Cited by 6 publications
(3 citation statements)
References 5 publications
“…[Th]ere is a large body of literature on spelling correction for morphologically rich languages, for example [5] and [6], and similar approaches have been successfully applied to OCR post-processing, for example [7,8,9]. In our work, we wanted to improve the language model instead of using post-processing to correct errors, because post-processing cannot in principle give as good results as improved language modeling.…”
Section: Related Work (mentioning)
confidence: 99%
“…Some of the work in this area is related to lexical [14] [15], syntactic [16], morphologic [17], phonemic [18] [19], semantic [20] [21] levels or certain combination of them [22] [23]. In other prototypes as CLICK-TALP this task is performed manually [24] [25].…”
Section: Ambiguity (mentioning)
confidence: 99%
“…As a last example, there is a tool for morphology [16] that performs morphological and syntactic analysis with disambiguated segmentation (splits text into segments according to its coherence), special symbol disambiguation (used for sounds not related to words) and error correction for words misunderstood.…”
Section: Introduction (mentioning)
confidence: 99%