Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics - 1999
DOI: 10.3115/1034678.1034712

A second-order Hidden Markov Model for part-of-speech tagging

Abstract: This paper describes an extension to the hidden Markov model for part-of-speech tagging using second-order approximations for both contextual and lexical probabilities. This model increases the accuracy of the tagger to state of the art levels. These approximations make use of more contextual information than standard statistical systems. New methods of smoothing the estimated probabilities are also introduced to address the sparse data problem.
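As a rough illustration of what "second-order" means here, the sketch below runs Viterbi decoding over tag pairs, so that each tag is scored given the two preceding tags and each word is scored given its own tag and the previous one. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name, the `trans` and `emit` table layouts, the `<s>` start symbol, and the fixed probability floor (a crude stand-in for the paper's smoothing methods) are all hypothetical.

```python
import math

def viterbi_second_order(words, tags, trans, emit, start="<s>"):
    """Return the most likely tag sequence under a second-order HMM.

    trans[(t2, t1, t)] : assumed table for P(t | t2, t1), the contextual probability
    emit[(t1, t, w)]   : assumed table for P(w | t1, t), the lexical probability,
                         conditioned on the previous tag as well as the current one
    """
    floor = 1e-12  # placeholder for unseen events; NOT the paper's smoothing

    def logp(table, key):
        return math.log(table.get(key, floor))

    # A state is the pair (previous tag, current tag); the value is the
    # log-probability of the best path ending in that state.
    best = {(start, start): 0.0}
    backptrs = []
    for w in words:
        new_best, ptr = {}, {}
        for (t2, t1), score in best.items():
            for t in tags:
                s = score + logp(trans, (t2, t1, t)) + logp(emit, (t1, t, w))
                if s > new_best.get((t1, t), -math.inf):
                    new_best[(t1, t)] = s
                    ptr[(t1, t)] = (t2, t1)
        best = new_best
        backptrs.append(ptr)

    # Follow the back-pointers from the best final state.
    state = max(best, key=best.get)
    out = []
    for ptr in reversed(backptrs):
        out.append(state[1])
        state = ptr[state]
    return out[::-1]


# Toy tables, invented purely for illustration.
toy_trans = {("<s>", "<s>", "D"): 0.8, ("<s>", "D", "N"): 0.9, ("D", "N", "V"): 0.8}
toy_emit = {("<s>", "D", "the"): 0.9, ("D", "N", "dog"): 0.8, ("N", "V", "barks"): 0.7}
toy_result = viterbi_second_order(["the", "dog", "barks"], ["D", "N", "V"], toy_trans, toy_emit)
```

Tracking tag pairs rather than single tags is what lets a standard Viterbi recursion handle the longer history; the cost is that the number of states grows with the square of the tag set size, which is one reason the sparse data problem mentioned in the abstract becomes more pressing.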

Cited by 96 publications (53 citation statements)
References 16 publications
“…Intuitively, higher order HMMs can be expected to be more accurate (Thede & Happer, 1999). However, the lack of coverage problem needs to be considered, as mentioned in Section 3.2.…”
Section: Hidden Markov Models
Citation type: mentioning
confidence: 99%
“…POS tags were obtained using the tagger PoST [14], [15]. This is an HMM based tagger which performs discriminative reranking of N-best hypotheses using features derived from n-grams.…”
Section: Part-of-speech (Pos) Tag N-grams
Citation type: mentioning
confidence: 99%
“…It is done in two stages: first, the contextual rules defined in SintaGest are applied; then, remaining ambiguities are suppressed with a statistical POS tagger based on a second-order hidden Markov model (HMM). This turns out to be a fast and efficient approach using the Viterbi algorithm [8,9]. The prior contextual and lexical probabilities were estimated by processing large, partially tagged corpora, among them the CETEMPúblico 1.7 collection of news from the Portuguese newspaper Público.…”
Section: Software Tools
Citation type: mentioning
confidence: 99%