Proceedings of the Third CIPS-SIGHAN Joint Conference on Chinese Language Processing 2014
DOI: 10.3115/v1/w14-6821

Extended HMM and Ranking Models for Chinese Spelling Correction

Abstract: Spelling correction has been studied for many decades, which can be classified into two categories: (1) regular text spelling correction, (2) query spelling correction. Although the two tasks share many common techniques, they have different concerns. This paper presents our work on the CLP-2014 bake-off. The task focuses on spelling checking on foreigner Chinese essays. Compared to online search query spelling checking task, more complicated techniques can be applied for better performance. Therefore, we prop…

Cited by 15 publications (13 citation statements)
References 6 publications
“…Conversely, if the matched pair made by the character and the character preceding it or following it is commonly seen in the text corpus, then that character's degree of erroneousness is very low here. Xiong et al (2014) proposed using the Hidden Markov Model (HMM) as the basis for a model to detect and correct erroneous characters. This method presupposes that unknown erroneous characters exist in the sentence, and seeks out each character's substitute character by means of phonetic writing (pinyin) and the Cangjie input code using Bayes' rule as its basis.…”
Section: Related Work | Type: mentioning
confidence: 99%
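The citation statement above summarizes an HMM-style approach: each written character is treated as an observation, the intended character is the hidden state, and substitute candidates come from characters with similar pronunciation or shape. The following is a minimal sketch of that general idea, not the implementation of Xiong et al (2014). The confusion set, the bigram counts, the vocabulary size, and the emission probabilities are all hypothetical toy values, and shape similarity (Cangjie code) is left out for brevity.

```python
import math

# Hypothetical toy confusion set: characters sharing a pinyin reading (zai4).
CONFUSION = {"在": ["再"], "再": ["在"]}

# Hypothetical toy bigram counts standing in for a corpus language model.
BIGRAM_COUNTS = {
    ("<s>", "再"): 5,
    ("<s>", "在"): 5,
    ("再", "见"): 50,
    ("在", "见"): 1,
    ("见", "</s>"): 20,
}
VOCAB_SIZE = 1000  # assumed vocabulary size for add-one smoothing
KEEP_PROB = 0.9    # assumed probability that a written character is correct


def transition(prev, cur):
    """Add-one smoothed log P(cur | prev) from the toy bigram counts."""
    pair = BIGRAM_COUNTS.get((prev, cur), 0)
    total = sum(c for (a, _), c in BIGRAM_COUNTS.items() if a == prev)
    return math.log((pair + 1) / (total + VOCAB_SIZE))


def emission(observed, hidden):
    """Toy log P(observed | hidden): high when the character is unchanged."""
    return math.log(KEEP_PROB if observed == hidden else 1.0 - KEEP_PROB)


def candidates(ch):
    """The written character itself plus its confusable substitutes."""
    return [ch] + CONFUSION.get(ch, [])


def correct(sentence):
    """Viterbi decoding over the candidate characters of each position."""
    # Maps a hidden character to (best log score, best path ending in it).
    scores = {"<s>": (0.0, [])}
    for obs in sentence:
        new_scores = {}
        for cand in candidates(obs):
            new_scores[cand] = max(
                (
                    (s + transition(prev, cand) + emission(obs, cand), path + [cand])
                    for prev, (s, path) in scores.items()
                ),
                key=lambda t: t[0],
            )
        scores = new_scores
    # Close the sentence with the end-of-sentence transition.
    best = max(
        ((s + transition(last, "</s>"), path) for last, (s, path) in scores.items()),
        key=lambda t: t[0],
    )
    return "".join(best[1])
```

With these toy values, the bigram evidence for "再见" outweighs the penalty for changing the written character, so `correct("在见")` proposes "再见", while an already correct input is returned unchanged.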
“…Our method combines different methods to improve performance. The main contributions compared with our previous work (Xiong et al, 2014) are:…”
Section: Introduction | Type: mentioning
confidence: 99%
“…This makes the model more robust since error detection is a nontrivial task for social media texts due to high number of slang, proper names (including colloquial) etc. By its architecture our model more resembles Xiong et al (2014), however, the set of features used differs significantly reflecting the difference between Chinese and Russian. As far as we know, our model is one of the first HMM-based systems used for spelling correction of a morphologically rich language.…”
Section: Previous Work | Type: mentioning
confidence: 99%